tagged-transcript-processing

These scripts analyze transcripts that have already been processed by the parser in the spanglish-pos-tagger repository.

Usage: place the transcripts to analyze in a directory called \properly_tagged_english or \properly_tagged_spanish, depending on the language of the transcripts, and set the LANGUAGE variable in main.py to match. The script files must sit in the directory that contains the \properly_tagged_x directory. Then run main.py; the data should be saved to a file called x_output.csv.
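Because a misplaced input directory is an easy mistake to make, it can help to check the layout before running main.py. The following is a minimal sketch of such a check, not part of the repository: the function names `input_dir_for` and `list_transcripts` and the `SUPPORTED_LANGUAGES` tuple are hypothetical, and the sketch assumes only what is stated above, namely that the input directory is named properly_tagged_english or properly_tagged_spanish and sits next to the scripts.

```python
from pathlib import Path

# Assumption: these are the only two language values, as described in the usage notes.
SUPPORTED_LANGUAGES = ("english", "spanish")


def input_dir_for(base_dir, language):
    """Return the transcript directory expected for the given language.

    Hypothetical helper: the repository's own scripts may build this path differently.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return Path(base_dir) / f"properly_tagged_{language}"


def list_transcripts(base_dir, language):
    """Return the files in the expected input directory, sorted by path.

    Raises FileNotFoundError if the directory does not exist, which usually
    means the transcripts were placed somewhere else or LANGUAGE is wrong.
    """
    directory = input_dir_for(base_dir, language)
    if not directory.is_dir():
        raise FileNotFoundError(f"expected input directory not found: {directory}")
    return sorted(p for p in directory.iterdir() if p.is_file())
```

Calling `list_transcripts(".", "spanish")` from the directory that holds the scripts would either list the files that main.py is expected to read or fail early with the missing path.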

If an input file is improperly formatted, main.py may terminate prematurely and print a number. This is the transcript number of the first transcript that threw an error. The misc.py file can be used to scan a batch of transcripts for such errors; place the files to investigate in a directory called \all. The transcript numbers of any defective files it detects are printed.
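The general idea of a batch scan can be sketched as follows. This is an illustration under stated assumptions, not the code in misc.py: `find_defective` is a hypothetical name, and `parse` is a stand-in for whatever reader function the repository uses, assumed here to raise an exception on a badly formatted file. The sketch reports file names rather than transcript numbers, since the numbering scheme is not described here.

```python
from pathlib import Path


def find_defective(directory, parse):
    """Return the names of files in `directory` for which `parse` raises.

    `parse` is a hypothetical callable that takes a file path and raises an
    exception when the file is improperly formatted.
    """
    defective = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        try:
            parse(path)
        except Exception:
            # Unlike a run that stops at the first failure, keep scanning
            # so that every defective file in the batch is reported.
            defective.append(path.name)
    return defective
```

Collecting every failure in one pass, instead of stopping at the first, is what makes a scan like this useful before a long run.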


Illinois-Linguistic-Data-Management/tagged-transcript-processing
