Error in Quick Start: Recurrent Models #133

Open
saddy001 opened this issue Apr 27, 2017 · 2 comments

@saddy001

Hi,

I think there's something wrong with the quick start example. I see rising accuracy but no real words:

sed for light, but only as an oi|ma;loaob1lu oh eoobol g oop"ebaoiple (55.3%)
downhill: RMSProp 170 loss=1.442173 err=1.442173 acc=0.552059
used for light, but only as an oi| luaiabafeoeflrbabnoahaao hreokbhhiaba e (55.2%)
downhill: validation 17 loss=1.446977 err=1.446977 acc=0.550113 *
downhill: RMSProp 171 loss=1.440421 err=1.440421 acc=0.552269
used for light, but only as an oi|iaoi -h.rbasop,lbea htpl?cbhiaaeb3eonylb (55.2%)
downhill: RMSProp 172 loss=1.439116 err=1.439116 acc=0.553969
used for light, but only as an oi|eaa.epatiboh,r? tbo rh ouoif;efetfeiu i (55.4%)
downhill: RMSProp 173 loss=1.443268 err=1.443268 acc=0.551297
used for light, but only as an oi|agoeea,eoswino-oaait oateerfaeraoeoa o (55.1%)
downhill: RMSProp 174 loss=1.438152 err=1.438152 acc=0.553594
used for light, but only as an oi|oa ghoia.aaaa0 am e b ,sbct;aoaoabo epa, (55.4%)
downhill: RMSProp 175 loss=1.432567 err=1.432567 acc=0.554312
used for light, but only as an oi|ofmaha,;orhooaaapebeohi!-e.hca pih mwhh (55.4%)
downhill: RMSProp 176 loss=1.433838 err=1.433838 acc=0.554353
used for light, but only as an oi|l ray a aiaal'a.btaea-ataaomhbabr,esal (55.4%)
downhill: RMSProp 177 loss=1.435675 err=1.435675 acc=0.553609
used for light, but only as an oi|;apb,e3eeibios ,aysta- ,;re ooadielai es (55.4%)
downhill: RMSProp 178 loss=1.439022 err=1.439022 acc=0.552759
used for light, but only as an oi|etnatio ail,b ; ulo pblh e,aboo,yibeey. (55.3%)
downhill: RMSProp 179 loss=1.432462 err=1.432462 acc=0.553709
used for light, but only as an oi|oeooneay noia.eaaoaioaeho.b ababb lsebna (55.4%)

In the example, meaningful words emerge at around 50% accuracy. I had to make two small changes to the example. First, the corpus is now served compressed by Gutenberg, so I had to decompress it. Second, to get the same seed sentence, I had to change
seed = txt.encode(txt.text[300017:300050])
to
seed = txt.encode(txt.text[300015:300048])
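
Since the hard-coded offsets shift whenever the corpus file changes, one option is to look the seed up by its text instead of by index. The following is only a sketch: `find_seed_span` is a hypothetical helper, not part of the quick start, and it assumes the seed phrase (taken from the log output above) appears in the corpus exactly as written.

```python
def find_seed_span(text, phrase):
    """Return (start, end) offsets of the first occurrence of phrase in text.

    Hypothetical helper: it replaces a hard-coded slice such as
    text[300015:300048] with a lookup that survives small shifts
    in the corpus file.
    """
    start = text.find(phrase)
    if start < 0:
        raise ValueError("seed phrase not found in corpus: %r" % phrase)
    return start, start + len(phrase)


# The 33-character seed shown in the training output above.
SEED_PHRASE = "used for light, but only as an oi"

# Small self-contained demonstration on a stand-in string.
sample = "It is " + SEED_PHRASE + "l lamp."
start, end = find_seed_span(sample, SEED_PHRASE)
print(sample[start:end])
```

With the real corpus, the call would presumably look like `start, end = find_seed_span(txt.text, SEED_PHRASE)` followed by `seed = txt.encode(txt.text[start:end])`. Note that `str.find` returns the first match only, so a phrase that occurs more than once may not land on the same position as the original index.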

@lmjohns3
Owner

Interesting, thanks for the report. I'll try to look into it in the next couple weeks. Please feel free to send a PR to fix up the compression and indexing issues if you like.

@saddy001
Author

In the docs,
curl http://www.gutenberg.org/cache/epub/2701/pg2701.txt > corpus.txt
should be
curl http://www.gutenberg.org/cache/epub/2701/pg2701.txt | gunzip -c > corpus.txt
The correct index is in my first comment above.
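
If the download is done from Python rather than with curl, a defensive variant is to decompress only when the payload actually looks gzip-compressed, so the same code works whether or not the server compresses the file. This is a sketch under assumptions: `maybe_gunzip` and `fetch_corpus` are hypothetical helpers, and the check relies on the standard two-byte gzip magic number.

```python
import gzip
import urllib.request

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(raw):
    """Decompress raw bytes if they are gzip data, else return them as-is."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def fetch_corpus(url, path="corpus.txt"):
    """Download url and write the (decompressed) bytes to path.

    Hypothetical helper; the URL is supplied by the caller.
    """
    with urllib.request.urlopen(url) as response:
        raw = response.read()
    with open(path, "wb") as handle:
        handle.write(maybe_gunzip(raw))
```

For this thread, the call would be `fetch_corpus("http://www.gutenberg.org/cache/epub/2701/pg2701.txt")`, which should leave a plain-text `corpus.txt` in either case.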
