pravshot/textov

textov

Group Name: textov

Group members: Praveen Kalva spkalva3, Anushri Mittal anushri6, Akanksha Kumar kumar65, Arul Viswanathan arulv2

Project intro: Our project uses Markov chains to generate sentences from a text dataset. textov reads user-supplied text files and generates newly ordered text that resembles the wording of the source file. It uses parallelism and sparse matrices to keep memory use and runtime low.

Goals:

  • Build a Markov chain map from the words in a text file
  • Use that Markov chain to generate sentences
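As a rough illustration of these two goals, here is a minimal sketch in Python. The README does not name the project's implementation language, so Python is an assumption made only for this example, and the names `START`, `END`, `build_chain`, and `generate_sentence` are hypothetical and not taken from the repository.

```python
import random
from collections import Counter, defaultdict

START = "<START>"  # hypothetical sentinel: beginning of a sentence
END = "<END>"      # hypothetical sentinel: end of a sentence

def build_chain(sentences):
    """Map each word to a Counter of the words observed right after it."""
    chain = defaultdict(Counter)
    for sentence in sentences:
        words = [START] + sentence.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current][following] += 1
    return chain

def generate_sentence(chain, rng, max_words=30):
    """Walk the chain from START, choosing successors in proportion to their counts."""
    word, output = START, []
    while len(output) < max_words:
        followers = chain[word]
        if not followers:  # nothing was ever observed after this word
            break
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == END:
            break
        output.append(word)
    return " ".join(output)

sentences = ["the cat sat", "the dog sat", "the cat ran"]
chain = build_chain(sentences)
print(generate_sentence(chain, random.Random(0)))
```

Every generated sentence only uses word pairs that occur in the input, which is why the output resembles the source text while the overall ordering can differ.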

Why we chose it:

  • We found Markov chains interesting and thought applying them to text would be a fun project idea.

System Overview:

  • Read from a text file
  • Clean the data
  • Create the Markov chain mapping from the cleaned data, stored as a sparse matrix
  • Create a stochastic model that picks each next word using the weighted probabilities in the sparse matrix, and use it to generate a sentence
  • Repeat for more sentences
  • Add parallelism/concurrency to speed up creating the Markov chain
  • Eventually, add higher-order Markov chains for different results (lower orders give more random text, higher orders more deterministic text)
  • If we have time, create a web app/UI
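The sparse-matrix and weighted-sampling steps above could look roughly like the following Python sketch. It is not the project's actual code: the language, the dictionary-of-rows storage, and the names `build_sparse_rows`, `sample_next`, and `generate` are all assumptions for illustration. The `order` parameter shows how a higher-order chain fits the same structure by using a tuple of words as the state.

```python
import bisect
import random
from itertools import accumulate

def build_sparse_rows(words, order=1):
    """Count transitions from a state (tuple of `order` words) to the next word.

    Only observed transitions are stored, so absent matrix entries cost nothing.
    """
    vocab = {}  # word -> column index
    rows = {}   # state tuple -> {column index: count}
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        col = vocab.setdefault(words[i + order], len(vocab))
        row = rows.setdefault(state, {})
        row[col] = row.get(col, 0) + 1
    return vocab, rows

def sample_next(row, id_to_word, rng):
    """Pick a column with probability count / row total, using cumulative sums."""
    cols = list(row)
    cumulative = list(accumulate(row[c] for c in cols))
    target = rng.randrange(cumulative[-1])  # integer in [0, row total)
    return id_to_word[cols[bisect.bisect_right(cumulative, target)]]

def generate(words, order, length, rng):
    """Generate up to `length` words, starting from the first state in the text."""
    vocab, rows = build_sparse_rows(words, order)
    id_to_word = {i: w for w, i in vocab.items()}
    state = tuple(words[:order])
    out = list(state)
    while len(out) < length and state in rows:
        nxt = sample_next(rows[state], id_to_word, rng)
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window forward by one word
    return out

text = "the cat sat on the mat and the cat ran".split()
print(" ".join(generate(text, order=1, length=8, rng=random.Random(0))))
print(" ".join(generate(text, order=2, length=8, rng=random.Random(0))))
```

With a larger `order`, each state has fewer observed successors, so the output follows the source more closely. That is the "more random to more deterministic" trade-off mentioned in the list.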

Possible Challenges:

  • Handling errors when reading data and creating the Markov chain
  • Using parallelism without running into errors with shared memory and ownership
  • Handling edge cases in data formatting/cleaning
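One possible way to approach these challenges is sketched below in Python (again an assumed language). Each worker counts transitions into its own private `Counter`, and the results are merged afterwards, so no mutable state is shared between workers. The names `read_text`, `clean_sentences`, `count_chunk`, and `parallel_counts` are hypothetical, and the cleaning rules (lowercasing, splitting on `.`, `!`, `?`) are just one reasonable choice.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def read_text(path):
    """Return the file's text, or an empty string if it cannot be read."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""

def clean_sentences(text):
    """Lowercase, split on sentence punctuation, and keep only word-like tokens."""
    sentences = []
    for raw in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z0-9']+", raw)
        if len(words) >= 2:  # a single word contributes no transition
            sentences.append(words)
    return sentences

def count_chunk(sentences):
    """Count (word, next word) pairs for one chunk into a private Counter."""
    local = Counter()
    for words in sentences:
        local.update(zip(words, words[1:]))
    return local

def parallel_counts(sentences, workers=4):
    """Split whole sentences across workers, then merge the per-worker counts."""
    chunks = [sentences[i::workers] for i in range(workers)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_chunk, chunks):
            total.update(partial)  # merging happens only in the calling thread
    return total

text = "The cat sat. The dog sat! Hi. the cat ran?"
print(parallel_counts(clean_sentences(text)))
```

Splitting on sentence boundaries means no transition spans two chunks, so the merged result matches a single-threaded count. In CPython, threads will not speed up this CPU-bound loop much, so the sketch shows the split-then-merge structure and not a measured speedup.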
