This repository has been archived by the owner on Feb 1, 2023. It is now read-only.

COMS4995 Open Source Development

Update

The following tasks are now completed:

  • add unit tests (I used the Mocha JavaScript testing framework)
  • add an automated CI workflow (I used GitHub Actions)
  • add linting (I used ESLint with the Airbnb style guide)
  • add code coverage (I used the nyc code-coverage tool)

Here is the link to the GitHub repo: https://github.com/Bruk3/j-parser

j-parser - translate any data serialization language into another.

  • Most npm parsers out there are single-purpose parsers, such as:
    • js-yaml: parses YAML files into a JSON object and dumps JSON objects into a YAML file
    • fast-xml-parser: parses XML files into a JSON object and dumps JSON objects into an XML file
    • table-to-json: parses HTML tables into a JSON object and vice versa.
  • js-universal-parser will be a wrapper around these libraries that exposes an API for seamless conversion from one format to another. Starting with YAML and XML, it will over time support conversions between more and more data serialization languages.
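
A minimal sketch of how such a wrapper could be organized, assuming a registry of format adapters that each expose `parse` and `dump`. The names `createConverter`, `register`, and `convert` are hypothetical and not taken from the j-parser code. The `json` and `kv` adapters are toy stand-ins so the sketch runs without dependencies; in practice the YAML and XML adapters would presumably delegate to js-yaml and fast-xml-parser.

```javascript
// Hypothetical sketch: every format is an adapter with parse (text -> object)
// and dump (object -> text). Converting A -> B is: parse with A, dump with B.
function createConverter() {
  const adapters = new Map();

  return {
    // Register an adapter under a format name such as 'json' or 'yaml'.
    register(format, adapter) {
      if (typeof adapter.parse !== 'function' || typeof adapter.dump !== 'function') {
        throw new TypeError(`adapter for "${format}" needs parse() and dump()`);
      }
      adapters.set(format, adapter);
    },

    // Convert text from one registered format to another.
    convert(text, from, to) {
      const source = adapters.get(from);
      const target = adapters.get(to);
      if (!source) throw new Error(`unsupported input format: ${from}`);
      if (!target) throw new Error(`unsupported output format: ${to}`);
      return target.dump(source.parse(text));
    },
  };
}

// Stand-in adapters (assumptions for illustration only).
const jsonAdapter = {
  parse: (text) => JSON.parse(text),
  dump: (value) => JSON.stringify(value),
};

// Toy flat "key=value" format, one pair per line; values stay strings.
const kvAdapter = {
  parse: (text) => Object.fromEntries(
    text.split('\n').filter((line) => line.includes('=')).map((line) => {
      const index = line.indexOf('=');
      return [line.slice(0, index).trim(), line.slice(index + 1).trim()];
    }),
  ),
  dump: (value) => Object.entries(value).map(([key, val]) => `${key}=${val}`).join('\n'),
};

const converter = createConverter();
converter.register('json', jsonAdapter);
converter.register('kv', kvAdapter);

const asKv = converter.convert('{"name":"j-parser","lang":"js"}', 'json', 'kv');
const asJson = converter.convert('name=j-parser\nlang=js', 'kv', 'json');
```

Because every conversion passes through a plain object, supporting a new format takes one adapter rather than one converter per pair of formats.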

UPDATE: New Project

I have started working on a new project called obscenity-filter. Here is a proposal for the new project:

Here is my motivation. I am originally from Ethiopia, where Amharic is the official language. A year ago, I ran a Google search in Amharic and noticed that the suggested autocompletions were extremely obscene. Even though Google supports searching in Amharic, it did not appear to filter profane words out of the list generated by its autocompletion algorithm, which exposed Amharic speakers to some extremely obscene suggestions.

What is it?

A JavaScript library that detects obscene Amharic words and phrases.

An obscenity filter library for the Amharic language.

A variety of public-facing social websites process text submitted by users in order to display it publicly. These websites usually rely on a profanity filtering service to avoid exposing the public to extremely obscene phrases. Unfortunately, most filtering packages support only a few languages. Amharic, the official language of Ethiopia with more than 20 million speakers worldwide, has not been one of the supported languages. At least not until obscenity-filter.
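
As a rough illustration of what word-level detection could look like, here is a minimal sketch that assumes a plain block list and whole-token matching. The names `createFilter`, `isObscene`, and `findObscene` are hypothetical and are not the obscenity-filter API. The block-list entries are harmless placeholders. Phrase matching and inflected word forms are left out.

```javascript
// Hypothetical sketch of a word-list filter. The entries passed in below are
// harmless placeholders, not a real obscenity list.
function createFilter(blockedWords) {
  // Normalize so that equivalent Unicode sequences compare equal.
  const normalize = (word) => word.normalize('NFC').toLowerCase();
  const blocked = new Set(blockedWords.map(normalize));

  // Split on anything that is not a letter, combining mark, or digit.
  // With the `u` flag, \p{L} covers Ethiopic letters, so Ethiopic
  // punctuation such as "።" and "፡" acts as a separator.
  const tokenize = (text) => text
    .split(/[^\p{L}\p{M}\p{N}]+/u)
    .filter(Boolean)
    .map(normalize);

  return {
    // Return the blocked words found in the text, in order of appearance.
    findObscene(text) {
      return tokenize(text).filter((token) => blocked.has(token));
    },
    // True when at least one blocked word appears as a whole token.
    isObscene(text) {
      return tokenize(text).some((token) => blocked.has(token));
    },
  };
}

// Placeholder block list: an Amharic word for "example" and an English stand-in.
const filter = createFilter(['ምሳሌ', 'placeholder']);

const clean = filter.isObscene('ሰላም ዓለም።');
const flagged = filter.isObscene('ይህ ምሳሌ፡ ነው።');
const matches = filter.findObscene('A Placeholder, then ምሳሌ።');
```

Matching whole tokens avoids flagging a clean word that merely contains a blocked one, at the cost of missing deliberately obfuscated spellings.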