atom.xml

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Zhengyuan Zhu</title>
  
  
  <link href="/atom.xml" rel="self"/>
  
  <link href="https://824zzy.github.io/"/>
  <updated>2022-05-27T10:19:35.958Z</updated>
  <id>https://824zzy.github.io/</id>
  
  <author>
    <name>Zhengyuan Zhu</name>
    
  </author>
  
  <generator uri="http://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Navigating Misinformation - How to identify and verify what you see on the web</title>
    <link href="https://824zzy.github.io/2022/05/26/self-directed-course-navigating-misinformation/"/>
    <id>https://824zzy.github.io/2022/05/26/self-directed-course-navigating-misinformation/</id>
    <published>2022-05-26T18:09:18.000Z</published>
    <updated>2022-05-27T10:19:35.958Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h2><p>This article is a learning note of the self-directed course <a href="https://journalismcourses.org/course/misinformation/">Navigating Misinformation - How to identify and verify what you see on the web</a>.</p><p>The main purpose of this course is to help researchers learn how fact-checkers identify and verify online content, namely how responsible reporting works in an age of misinformation/disinformation.<br>The topics below are involved in this course:</p><ol><li>Discovery of problematic content</li><li>Basic verification of online sources</li><li>Advanced verification of online sources<ol><li>How date and time stamps work on social posts</li><li>How to geo-locate where a photo or video was taken</li><li>Tools overview to help determine the time in a photo or video</li><li>Verification challenge</li></ol></li></ol><h2 id="Discovery-of-problematic-content"><a href="#Discovery-of-problematic-content" class="headerlink" title="Discovery of problematic content"></a>Discovery of problematic content</h2><p>To keep track of the misleading claims and content, journalists are monitoring multiple social media. The first and foremost thing to figure out is <strong>what should be monitored - groups and/or topics</strong>, and what you choose will depend on the social platform. In general, journalists use Reddit, Facebook and Twitter as information sources.</p><h3 id="Information-sources"><a href="#Information-sources" class="headerlink" title="Information sources"></a>Information sources</h3><h4 id="Reddit"><a href="#Reddit" class="headerlink" title="Reddit"></a>Reddit</h4><p>Reddit is the eighth most popular website in the world even more popular than Twitter. <strong>Misinformation ends up circulating widely on Facebook and Twitter often appears on Reddit first</strong>. Reddit is made up of a collection of open forums called <strong>subreddit</strong>, the subreddit can be discovered through the general search page. Once you have found an interesting subreddit, you can search for its name to discover similar subreddits. Also, keep an eye out for new subreddits mentioned in the comments.</p><h4 id="Twitter"><a href="#Twitter" class="headerlink" title="Twitter"></a>Twitter</h4><p>There are two key ways to monitor Twitter activity: <strong>terms and lists</strong>.<br>The terms include keywords, domains, hashtags and usernames. More specifically, journalists focus on those websites and particular accounts that are likely to produce misleading content, and tweets that include certain keywords or hashtags, like “snowflakes” and “#Lockherup”. The <a href="https://developer.twitter.com/en/docs/api-reference-index">Twitter Search API</a> provides a powerful way to form a query. Below are the example of Twitter search operators:</p><p><img src="https://firstdraftnews.org/wp-content/uploads/2017/07/tweets_monitoring.png" alt="Twitter search operators"></p><p>On the other hand, using Twitter lists is another effective way of quickly putting together groups of accounts to monitor. The lists can be created by any Twitter user who is following a group of accounts as a unit. Journalists use Twitter lists to capitalize on the expertise of other journalists, however, Twitter hasn’t provided an API to easily search Twitter lists based on keywords. Thus, we have to utilize a Google hack to search through Twitter lists.<br>The hack is: for any topic keywords that you are interested in, add the query <code>site:twitter.com/*/lists [keywords]</code> in the google search bar. Google will return keyword-related public lists of all Twitter users. What’s more, by going to the list creator’s profile and clicking <code>More</code> and then <code>Lists</code>, you can find more lists that potentially attract you.<br>And by keep doing so recursively, you can combine the lists that you have found into a super list.</p><h4 id="Facebook"><a href="#Facebook" class="headerlink" title="Facebook"></a>Facebook</h4><p>The potential to monitor Facebook is narrower due to two reasons. First, the content available is designated public by users. Second, Facebook does not support direct, programmatic access to the public feed.</p><h3 id="Monitoring-Reddit-Facebook-and-Instagram-with-CrowdTangle"><a href="#Monitoring-Reddit-Facebook-and-Instagram-with-CrowdTangle" class="headerlink" title="Monitoring Reddit, Facebook and Instagram with CrowdTangle"></a>Monitoring Reddit, Facebook and Instagram with CrowdTangle</h3><p><a href="https://www.crowdtangle.com/">Crowdtangle</a> was made free after being acquired by Facebook. It takes search queries, groups and pages and creates custom social feeds for Facebook, Instagram, and Reddit. If you give it a search query, it creates a feed of posts from the platform that match that query. If you give it a list of accounts, it creates a feed of posts from those accounts.</p><p><img src="https://firstdraftnews.org/wp-content/uploads/2017/08/crowdtangle.png" alt="crowdtange user interface"></p><h3 id="Monitoring-Twitter-with-TweetDeck"><a href="#Monitoring-Twitter-with-TweetDeck" class="headerlink" title="Monitoring Twitter with TweetDeck"></a>Monitoring Twitter with TweetDeck</h3><p>By far, the easiest way to monitor multiple Twitter streams in real-time is TweetDeck. With Tweetdeck, you can arrange an unlimited number of real-time streams of tweets side-by-side in columns that can easily be cycled through.</p><p><img src="https://firstdraftnews.org/wp-content/uploads/2017/08/tweetdeck.png" alt="TweetDeck"></p><h2 id="Basic-verification-of-online-sources"><a href="#Basic-verification-of-online-sources" class="headerlink" title="Basic verification of online sources"></a>Basic verification of online sources</h2><p>When attempting to verify a piece of content, journalists always investigate five elements:</p><ol><li><p><strong>Provenance</strong>: verify if the content is original</p><p> If we are not looking at the original, all the metadata about the source and date will be wrong and useless. The journalists are facing the challenge that footage can easily jump from platform to platform or prevail inside a platform, thus we should always be suspicious about the content originality.</p></li><li><p><strong>Source</strong>: verify who created the content</p><p> Note that source means who captured the content instead of who uploaded the content. To verify the source, one can depend on two aspects: directly contact the user and check the user location and event location are the same.</p></li><li><p><strong>Date</strong>: verify when the content captured</p><p> Never assuming the content uploaded date is when the content was captured.</p></li><li><p><strong>Location</strong>: verify where the content captured</p><p> The geolocation can be easily manipulated on social media platforms, so it is better to double-check the location on a map or satellite image.</p></li><li><p><strong>Motivation</strong>: verify why the content captured</p><p> The user can be an accidental eyewitness or a responsible stakeholder.</p></li></ol><p>With the help of reversed image search tools such as <a href="https://images.google.com/">Google Images</a> and <a href="https://chrome.google.com/webstore/detail/reveye-reverse-image-sear/keaaclcjhehbbapnphnmpiklalfhelgf?hl=en">RevEye</a>, one can easily accomplish the verification.</p><h2 id="Advanced-verification-of-online-sources"><a href="#Advanced-verification-of-online-sources" class="headerlink" title="Advanced verification of online sources"></a>Advanced verification of online sources</h2><ol><li><p>Wolfram Alpha</p><p> Wolfram Alpha is a knowledge engine that brings available information from across the web. It has a powerful tool for checking the weather from any particular location on any date. When you are trying to <strong>double-check the date on an image or video</strong>, it can be very useful.</p></li><li><p>Shadow Analysis</p><p> Check if the shadow is in the right shape, right length and the same direction.</p></li><li><p>Geo-location</p><p> Only a very small percentage of social media posts are geo-tagged by users themselves. Luckily <a href="https://www.google.com/maps/">high-quality satellite and street view imagery</a> allows you to pace yourself on the map and stand in the place that the user was standing when they captured the footage.</p></li></ol><h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><ul><li><a href="https://firstdraftnews.org/articles/monitor-social-media/">How to begin to monitor social media for misinformation</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Overview&quot;&gt;&lt;a href=&quot;#Overview&quot; class=&quot;headerlink&quot; title=&quot;Overview&quot;&gt;&lt;/a&gt;Overview&lt;/h2
      
    
    </summary>
    
      <category term="Factchecking" scheme="https://824zzy.github.io/categories/Factchecking/"/>
    
    
  </entry>
  
  <entry>
    <title>Deecamp-28组AGI第一次沙龙分享活动纪要</title>
    <link href="https://824zzy.github.io/2019/07/21/AGI-salon-1/"/>
    <id>https://824zzy.github.io/2019/07/21/AGI-salon-1/</id>
    <published>2019-07-21T05:00:00.000Z</published>
    <updated>2019-07-22T11:20:38.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Deecamp-28组AGI沙龙活动纪要"><a href="#Deecamp-28组AGI沙龙活动纪要" class="headerlink" title="Deecamp-28组AGI沙龙活动纪要"></a>Deecamp-28组AGI沙龙活动纪要</h2><ul><li>日期：2019年7月21日周日晚上8点-10点30分</li><li>地点：国科大教一楼132</li><li>轮值主持人： 朱正源</li><li>沙龙主题: 量化投资模型分享及实践展示</li></ul><h2 id="沙龙内容"><a href="#沙龙内容" class="headerlink" title="沙龙内容"></a>沙龙内容</h2><h3 id="主持人宣讲"><a href="#主持人宣讲" class="headerlink" title="主持人宣讲"></a>主持人宣讲</h3><p><img src="https://user-images.githubusercontent.com/13566583/61606864-d7ecfc00-ac7e-11e9-9dac-80333cea971e.jpg" alt="agi-host"></p><h3 id="刘兆丰"><a href="#刘兆丰" class="headerlink" title="刘兆丰"></a>刘兆丰</h3><blockquote><p>分享了<strong>Using Deep Reinforcement Learning to Trade</strong>, 介绍了强化学习的发展历程，讲解了一种结合深度神经网络、递归神经网络和强化学习的量化交易决策模型，并展示了该模型在真实数据中的实验结果。<br><img src="https://user-images.githubusercontent.com/13566583/61606998-6bbec800-ac7f-11e9-8897-4da210ada6f3.jpg" alt="8061563762930_ pic_hd"><br><img src="https://user-images.githubusercontent.com/13566583/61607007-737e6c80-ac7f-11e9-814f-7a1fad39e1a6.jpg" alt="8071563762947_ pic_hd"></p></blockquote><h3 id="朱正源"><a href="#朱正源" class="headerlink" title="朱正源"></a>朱正源</h3><blockquote><p>分享了<a href="https://docs.google.com/presentation/d/1W7RGD3X_MZB3dfzaTrQdGYzv_zuay2JpYOTQUjXzK5A/edit#slide=id.g4461849552_8_1825">Introduction to Quantitative Investment with Deep Learning</a>, 分享了有关量化投资方向的股票预测模型，使用工业届常用的时间序列回归模型，通过预先构建多种假设，使用LSTM进行滑窗预测，介绍了量化交易中的真实情况，股市有风险，入市需谨慎～<br><img src="https://user-images.githubusercontent.com/13566583/61607014-7bd6a780-ac7f-11e9-801e-524c6cbea4c3.jpg" alt="8081563762955_ pic_hd"><br><img src="https://user-images.githubusercontent.com/13566583/61607019-7da06b00-ac7f-11e9-84ab-703ba1ad8525.jpg" alt="8091563762967_ pic_hd"></p></blockquote><h3 id="葛景琳"><a href="#葛景琳" class="headerlink" title="葛景琳"></a>葛景琳</h3><blockquote><p><strong>设计人员分享</strong>：介绍前期调研的几个阶段及调研目的，分享新产品项目推进的流程概况。</p></blockquote><p><img src="https://user-images.githubusercontent.com/13566583/61606932-23070f00-ac7f-11e9-903b-2874c5ec77fb.jpg" alt="8101563762973_ pic_hd"><br><img src="https://user-images.githubusercontent.com/13566583/61606934-269a9600-ac7f-11e9-8e03-d950207daf6f.jpg" alt="8111563762981_ pic_hd"></p><h2 id="集体合照"><a href="#集体合照" class="headerlink" title="集体合照"></a>集体合照</h2><p><img src="https://user-images.githubusercontent.com/13566583/61606966-4b8f0900-ac7f-11e9-876f-d2d8e60502ce.jpg" alt="8041563762837_ pic_hd"></p><h2 id="沙龙讨论内容"><a href="#沙龙讨论内容" class="headerlink" title="沙龙讨论内容"></a>沙龙讨论内容</h2><ol><li>在活动开始前建议调整设备</li><li>沙龙注重时间控制</li><li>不要拘束，不用师兄师姐的称呼，不要过分自谦</li><li>Demo展示具体形式需要等待产业导师就位再做定夺</li><li>下次沙龙暂定于下个没有课的晚上</li></ol><h2 id="特别鸣谢Deecamp全体人员对本沙龙的支持与帮助"><a href="#特别鸣谢Deecamp全体人员对本沙龙的支持与帮助" class="headerlink" title="特别鸣谢Deecamp全体人员对本沙龙的支持与帮助~"></a>特别鸣谢Deecamp全体人员对本沙龙的支持与帮助~</h2>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Deecamp-28组AGI沙龙活动纪要&quot;&gt;&lt;a href=&quot;#Deecamp-28组AGI沙龙活动纪要&quot; class=&quot;headerlink&quot; title=&quot;De
      
    
    </summary>
    
      <category term="Quant" scheme="https://824zzy.github.io/categories/Quant/"/>
    
    
      <category term="agi" scheme="https://824zzy.github.io/tags/agi/"/>
    
      <category term="salon" scheme="https://824zzy.github.io/tags/salon/"/>
    
  </entry>
  
  <entry>
    <title>DeepInvestment introduction</title>
    <link href="https://824zzy.github.io/2019/07/12/DeepInvestment/"/>
    <id>https://824zzy.github.io/2019/07/12/DeepInvestment/</id>
    <published>2019-07-12T05:00:00.000Z</published>
    <updated>2019-07-13T17:26:49.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="The-slide-shows-all-you-need"><a href="#The-slide-shows-all-you-need" class="headerlink" title="The slide shows all you need"></a>The slide shows all you need</h2><p><strong>Whose money I want to make</strong>: Essentially according to Game Theory</p><p><a href="https://docs.google.com/presentation/d/1W7RGD3X_MZB3dfzaTrQdGYzv_zuay2JpYOTQUjXzK5A/edit#slide=id.g4461849552_8_1825">Introduction to Quantitative Investment with Deep Learning</a></p>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;The-slide-shows-all-you-need&quot;&gt;&lt;a href=&quot;#The-slide-shows-all-you-need&quot; class=&quot;heade
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="deepLearning" scheme="https://824zzy.github.io/tags/deepLearning/"/>
    
      <category term="quantitativeInvestment" scheme="https://824zzy.github.io/tags/quantitativeInvestment/"/>
    
  </entry>
  
  <entry>
    <title>Mutation test and Deep Learning</title>
    <link href="https://824zzy.github.io/2019/06/09/mutationtest-deeplearning/"/>
    <id>https://824zzy.github.io/2019/06/09/mutationtest-deeplearning/</id>
    <published>2019-06-09T05:00:00.000Z</published>
    <updated>2019-06-09T08:04:28.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Brief-Introduction-to-Mutation-Test"><a href="#Brief-Introduction-to-Mutation-Test" class="headerlink" title="Brief Introduction to Mutation Test"></a>Brief Introduction to Mutation Test</h2><blockquote><p>Mutation testing is a mature technology for testing data quality assessment in traditional software.<br>Mutation testing is a form of white-box testing.<br>Mutation testing (or mutation analysis or program mutation) is used to design new software tests and evaluate the quality of existing software tests. </p></blockquote><h3 id="Goal"><a href="#Goal" class="headerlink" title="Goal"></a>Goal</h3><p>The goals of mutation testing are multiple:</p><ul><li>identify weakly tested pieces of code (those for which mutants are not killed)</li><li>identify weak tests (those that never kill mutants)</li><li>compute the mutation score</li><li>learn about error propagation and state infection in the program</li></ul><h3 id="Example"><a href="#Example" class="headerlink" title="Example"></a>Example</h3><p>Selecting some <strong>mutation operations</strong>, and applying them to the source code for each executable code segment in turn. </p><p>The result of using a mutation operation on a program is called a mutant heterogeneity. </p><p><strong>If the test unit can detect the error (ie, a test fails), then the mutant is said to have been killed.</strong></p><h4 id="Foo"><a href="#Foo" class="headerlink" title="Foo"></a>Foo</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">foo</span>(<span class="params">x: <span class="built_in">int</span>, y: <span class="built_in">int</span></span>) -&gt; <span class="built_in">int</span>:</span><br><span class="line">z = <span class="number">0</span></span><br><span class="line">If x&gt;<span class="number">0</span> <span class="keyword">and</span> y&gt;<span class="number">0</span>:</span><br><span class="line">z = x</span><br><span class="line"><span class="keyword">return</span> z</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">foo</span>(<span class="params">x: <span class="built_in">int</span>, y: <span class="built_in">int</span></span>) -&gt; <span class="built_in">int</span>:</span><br><span class="line">z = <span class="number">0</span></span><br><span class="line">If x&gt;<span class="number">0</span> <span class="keyword">and</span> y&gt;=<span class="number">0</span>:</span><br><span class="line">z = x</span><br><span class="line"><span class="keyword">return</span> z</span><br></pre></td></tr></table></figure><p>Given some test cases, we find that unit test cannot find variants<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Success</span><br><span class="line">assertEquals(<span class="number">2</span>, foo(<span class="number">2</span>, <span class="number">2</span>))</span><br><span class="line">assertEquals(<span class="number">0</span>, foo(<span class="number">2</span>, <span class="number">1</span>))</span><br><span class="line">assertEquals(<span class="number">0</span>, foo(-<span class="number">1</span>, <span class="number">2</span>))</span><br></pre></td></tr></table></figure><br>Add new tests to achieve the effect of eliminating variants:<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="literal">False</span></span><br><span class="line">assertEquals(<span class="number">0</span>, foo(<span class="number">2</span>, <span class="number">0</span>))</span><br></pre></td></tr></table></figure></p><h4 id="Bar"><a href="#Bar" class="headerlink" title="Bar"></a>Bar</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">bar</span>(<span class="params">a: <span class="built_in">int</span>, b:<span class="built_in">int</span></span>) -&gt; <span class="built_in">int</span>:</span><br><span class="line">    <span class="keyword">if</span> a <span class="keyword">and</span> b:</span><br><span class="line">        c = <span class="number">1</span></span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        c = <span class="number">0</span></span><br><span class="line">    <span class="keyword">return</span> c</span><br><span class="line"><span class="comment"># Here is an mutation which operator is `and`</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">bar</span>(<span class="params">a: <span class="built_in">int</span>, b:<span class="built_in">int</span></span>) -&gt; <span class="built_in">int</span>:</span><br><span class="line">    <span class="keyword">if</span> a <span class="keyword">or</span> b:</span><br><span class="line">        c = <span class="number">1</span></span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        c = <span class="number">0</span></span><br><span class="line">    <span class="keyword">return</span> c</span><br></pre></td></tr></table></figure><p>Given a test case that will absolutely pass:<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Success</span><br><span class="line">assertEquals(<span class="number">1</span>, foo(<span class="number">1</span>, <span class="number">1</span>))</span><br></pre></td></tr></table></figure><br>But we need to kill the mutation by adding more test cases:<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Failed:</span><br><span class="line">assertEquals(<span class="number">0</span>, foo(<span class="number">1</span>, <span class="number">0</span>))</span><br><span class="line">assertEquals(<span class="number">0</span>, foo(<span class="number">0</span>, <span class="number">1</span>))</span><br><span class="line">assertEquals(<span class="number">1</span>, foo(<span class="number">0</span>, <span class="number">0</span>))   </span><br></pre></td></tr></table></figure></p><h3 id="Inspriation"><a href="#Inspriation" class="headerlink" title="Inspriation"></a>Inspriation</h3><p>In deep learning, you can also create variants by changing the operators in the model. </p><p>Adding the idea of ​​the mutation test to the deep learning model, if the performance of the model after the mutation is unchanged, then there is a problem with the test set</p><p>It is necessary to add or generate higher quality test data to achieve the data enhancement effect.</p><h2 id="A-comparison-of-traditional-and-DL-software-development"><a href="#A-comparison-of-traditional-and-DL-software-development" class="headerlink" title="A comparison of traditional and DL software development"></a>A comparison of traditional and DL software development</h2><p><img src="https://ws1.sinaimg.cn/mw690/ca26ff18gy1g3uypojkkyj20z20kc77j.jpg" alt=""></p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://en.wikipedia.org/wiki/Mutation_testing">wiki: Mutation Testing</a></li><li><a href="https://www.testwo.com/article/869">突变测试——通过一个简单的例子快速学习这种有趣的测试技术</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Brief-Introduction-to-Mutation-Test&quot;&gt;&lt;a href=&quot;#Brief-Introduction-to-Mutation-Test
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="deepLearning" scheme="https://824zzy.github.io/tags/deepLearning/"/>
    
      <category term="mutationTest" scheme="https://824zzy.github.io/tags/mutationTest/"/>
    
  </entry>
  
  <entry>
    <title>Comparison of ON-LSTM and DIORA</title>
    <link href="https://824zzy.github.io/2019/05/31/ON-LSTM-and-DIORA/"/>
    <id>https://824zzy.github.io/2019/05/31/ON-LSTM-and-DIORA/</id>
    <published>2019-05-31T05:00:00.000Z</published>
    <updated>2019-06-09T04:50:29.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="ON-LSTM"><a href="#ON-LSTM" class="headerlink" title="ON-LSTM"></a>ON-LSTM</h2><p>Insprition under the hood: How to introduce grammar tree structure into LSTM in an unsupervised apporach.</p><h3 id="Introduction-Ordered-Neurons-ON"><a href="#Introduction-Ordered-Neurons-ON" class="headerlink" title="Introduction: Ordered Neurons(ON)"></a>Introduction: Ordered Neurons(ON)</h3><ol><li>The neurons inside ON-LSTM are specifically <code>ordered</code> to <code>express richer information</code>: Change the order of update frequency.</li><li>The specific order of neurons is to integrate the hierarchical structure (tree structure) into the LSTM, allowing the LSTM to <code>automatically learn the hierarchical structure</code>.</li><li><code>High/Low level information</code>: Should keep longer/shorter in corresponding coding interval.</li><li><code>cumax()</code>: A special function to internate special $F1$ and $F2$gate.</li></ol><h3 id="The-nuts-and-bolts-in-Mathematic"><a href="#The-nuts-and-bolts-in-Mathematic" class="headerlink" title="The nuts and bolts in Mathematic"></a>The nuts and bolts in Mathematic</h3><div class="row"><iframe src="https://drive.google.com/file/d/1UjxnKAcMtydDEr_-PuvVMRLK_2pjx80M/preview" style="width:100%; height:550px"></iframe></div><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ol><li><a href="https://kexue.fm/archives/6621">ON-LSTM：用有序神经元表达层次结构</a></li></ol>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;ON-LSTM&quot;&gt;&lt;a href=&quot;#ON-LSTM&quot; class=&quot;headerlink&quot; title=&quot;ON-LSTM&quot;&gt;&lt;/a&gt;ON-LSTM&lt;/h2&gt;&lt;p&gt;
      
    
    </summary>
    
      <category term="NLP" scheme="https://824zzy.github.io/categories/NLP/"/>
    
    
      <category term="LSTM" scheme="https://824zzy.github.io/tags/LSTM/"/>
    
  </entry>
  
  <entry>
    <title>Using Scheduled Sample to improve sentence quality</title>
    <link href="https://824zzy.github.io/2019/05/10/schedule-sampling/"/>
    <id>https://824zzy.github.io/2019/05/10/schedule-sampling/</id>
    <published>2019-05-10T22:15:30.000Z</published>
    <updated>2019-05-12T05:27:11.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><p><strong>Note that the author is not Yoshua Bengio</strong></p><h2 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h2><p>In Seq2Seq sequence learning task, using Scheduled Sampling can improve performance of RNN model.</p><p>The ditribution bewteen traning stage and evaluating stage are different and reults in  <strong>error accumulation question</strong> in evaluating stage. </p><p>The former methods deal with this error accumullation problem is <code>Teacher Forcing</code>.</p><p>Scheduled Sampling can solve the problem through take generated words as input for decoder in certain probability. </p><p>Note that scheduled sampling is only applied in training stage.</p><h2 id="Algorithm-Details"><a href="#Algorithm-Details" class="headerlink" title="Algorithm Details"></a>Algorithm Details</h2><p>In training stage, when generate word $t$, Instead of take ground truth word $y_{t<em>1}$ as input, Scheduled Sampling take previous generated word $g</em>{t-1}$ in certain probability.</p><p>Assume that in $i_{th}$ mini-batch, Schduled Sampling define a probability $\epsilon_i$ to control the input of decoder. And $\epsilon_i$ is a probability variable that decreasing as $i$ increasing.</p><p>There are three decreasing methods:<br>$$Linear Decay: \epsilon_i = max(\epsilon, (k-c)*i), where \epsilon restrict minimum of \epsilon_i, k and c controll the range of decay$$<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2wav8w8fvj20hs0d841u.jpg" alt=""></p><p><strong>Warning</strong>:<br>In time step $t$, Scheduled Sampling will take $y_{t-1}$ according to $\epsilon<em>i$ as input. And take $g</em>{t-1}$ according to $1-\epsilon_i$ as input.</p><p>As a result, decoder will tend to use generated word as input.</p><h2 id="Implementation"><a href="#Implementation" class="headerlink" title="Implementation"></a>Implementation</h2><h3 id="Parameters"><a href="#Parameters" class="headerlink" title="Parameters"></a>Parameters</h3><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">parser.add_argument(<span class="string">&#x27;--scheduled_sampling_start&#x27;</span>, <span class="built_in">type</span>=<span class="built_in">int</span>, default=<span class="number">0</span>, <span class="built_in">help</span>=<span class="string">&#x27;at what epoch to start decay gt probability, -1 means never&#x27;</span>)</span><br><span class="line">parser.add_argument(<span class="string">&#x27;--scheduled_sampling_increase_every&#x27;</span>, <span class="built_in">type</span>=<span class="built_in">int</span>, default=<span class="number">5</span>,<span class="built_in">help</span>=<span class="string">&#x27;every how many epochs to increase scheduled sampling probability&#x27;</span>)</span><br><span class="line">parser.add_argument(<span class="string">&#x27;--scheduled_sampling_increase_prob&#x27;</span>, <span class="built_in">type</span>=<span class="built_in">float</span>, default=<span class="number">0.05</span>,<span class="built_in">help</span>=<span class="string">&#x27;How much to update the prob&#x27;</span>)</span><br><span class="line">parser.add_argument(<span class="string">&#x27;--scheduled_sampling_max_prob&#x27;</span>, <span class="built_in">type</span>=<span class="built_in">float</span>, default=<span class="number">0.25</span>,<span class="built_in">help</span>=<span class="string">&#x27;Maximum scheduled sampling prob.&#x27;</span>)</span><br></pre></td></tr></table></figure><h3 id="Assign-scheduled-sampling-probability"><a href="#Assign-scheduled-sampling-probability" class="headerlink" title="Assign scheduled sampling probability"></a>Assign scheduled sampling probability</h3><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># scheduled sampling probability is min(epoch*0.01, 0.25)</span></span><br><span class="line">frac = (epoch - opt.scheduled_sampling_start) // opt.scheduled_sampling_increase_every</span><br><span class="line">opt.ss_prob = <span class="built_in">min</span>(opt.scheduled_sampling_increase_prob * frac, opt.scheduled_sampling_max_prob)</span><br><span class="line">model.ss_prob = opt.ss_prob</span><br><span class="line"></span><br><span class="line"><span class="comment"># choose the word when decoding</span></span><br><span class="line"><span class="keyword">if</span> self.ss_prob &gt; <span class="number">0.0</span>:</span><br><span class="line">    sample_prob = torch.FloatTensor(batch_size).uniform_(<span class="number">0</span>, <span class="number">1</span>).cuda()</span><br><span class="line">    sample_mask = sample_prob &lt; self.ss_prob</span><br><span class="line">    <span class="keyword">if</span> sample_mask.<span class="built_in">sum</span>() == <span class="number">0</span>: <span class="comment"># use ground truth</span></span><br><span class="line">        last_word = caption[:, i].clone()</span><br><span class="line">    <span class="keyword">else</span>: <span class="comment"># use previous generated words</span></span><br><span class="line">        sample_ind = sample_mask.nonzero().view(-<span class="number">1</span>)</span><br><span class="line">        last_word = caption[:, i].data.clone()</span><br><span class="line">        <span class="comment"># fetch prev distribution: shape Nx(M+1)</span></span><br><span class="line">        prob_prev = torch.exp(log_probs.data)</span><br><span class="line">        last_word.index_copy_(<span class="number">0</span>, sample_ind,</span><br><span class="line">                                torch.multinomial(prob_prev, <span class="number">1</span>).view(-<span class="number">1</span>).index_select(<span class="number">0</span>, sample_ind))</span><br><span class="line">        last_word = Variable(last_word)</span><br><span class="line"><span class="keyword">else</span>:</span><br><span class="line">    last_word = caption[:, i].clone()</span><br></pre></td></tr></table></figure><h2 id="Result"><a href="#Result" class="headerlink" title="Result"></a>Result</h2><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2wdkxqmgvj20sg09ywnl.jpg" alt=""></p><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ul><li><a href="https://cloud.tencent.com/developer/article/1081168">【序列到序列学习】使用Scheduled Sampling改善翻译质量</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;&lt;strong&gt;Note that the author is not Yoshua Bengio&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id=&quot;Overview&quot;&gt;&lt;a hre
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="videoCaptioning" scheme="https://824zzy.github.io/tags/videoCaptioning/"/>
    
  </entry>
  
  <entry>
    <title>basic_knowledge_supplement</title>
    <link href="https://824zzy.github.io/2019/05/06/basic-knowledge-supplement/"/>
    <id>https://824zzy.github.io/2019/05/06/basic-knowledge-supplement/</id>
    <published>2019-05-06T19:27:55.000Z</published>
    <updated>2019-05-12T05:27:06.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h1 id="Machine-Learning"><a href="#Machine-Learning" class="headerlink" title="Machine Learning"></a>Machine Learning</h1><h2 id="Basic-knowledge"><a href="#Basic-knowledge" class="headerlink" title="Basic knowledge"></a>Basic knowledge</h2><h3 id="Bias-and-variance"><a href="#Bias-and-variance" class="headerlink" title="Bias and variance"></a>Bias and variance</h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2rmgy3ib1j208s0dhtbb.jpg" alt=""></p><ol><li>Bias:</li></ol><p>Represent fitting ability, a naive model will lead to high bias because of underfitting.</p><ol><li>Variance</li></ol><p>Represent stability, a complex model will lead to high variance beacause of overfitting.</p><p>$$Generalization error = Bias^2 + Variance + Irreducible Error$$</p><h3 id="Generative-model-and-Discriminative-Model"><a href="#Generative-model-and-Discriminative-Model" class="headerlink" title="Generative model and Discriminative Model"></a>Generative model and Discriminative Model</h3><ol><li>Discriminative Model</li></ol><p>Learn a <code>function</code> or <code>conditional probability model P(X|Y)</code>(posterior probability) directly.</p><ol><li>Generative Model<br>Learn a <code>joint probability model P(X, Y)</code> then to calculate <code>P(Y|X)</code></li></ol><h3 id="Search-hyper-parameter"><a href="#Search-hyper-parameter" class="headerlink" title="Search hyper-parameter"></a>Search hyper-parameter</h3><ol><li>Grid search</li><li>Random search</li></ol><h3 id="Euclidean-distance-and-Cosine-distance"><a href="#Euclidean-distance-and-Cosine-distance" class="headerlink" title="Euclidean distance and Cosine distance"></a>Euclidean distance and Cosine distance</h3><p>Example: A=[2, 2, 2] B=[5, 5, 5] represents <code>two</code> review scores of <code>three</code> movie.<br>the Euclidean distance is $\sqrt{3^2 + 3^2 + 3^2}$, and the Cosine distance is $1$. As a result, Cosine distance can avoid of difference</p><p>After normalization, essentially they are the same,<br>$$D=(x-y)^2 = x^2+y^2-2|x||y|cosA = 2-2cosA，D=2(1-cosA)$$</p><h3 id="Confusion-Matrix"><a href="#Confusion-Matrix" class="headerlink" title="Confusion Matrix"></a>Confusion Matrix</h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2rmrezxi4j20wq042t9q.jpg" alt=""></p><ol><li>accuracy: $ACC = \frac{TP+TN}{TP+FN+FP+FN}$</li><li>precison: $P = \frac{TP}{TP+FP}$</li><li>recall: $R = \frac{TP}{TP+FN}$</li><li>F1: $F_1 = \frac{2TP}{2TP+FP+FN}$</li></ol><h3 id="deal-with-missing-value"><a href="#deal-with-missing-value" class="headerlink" title="deal with missing value"></a>deal with missing value</h3><ol><li>More missing value: drop feature column.</li><li>Less missing value: fill a value<ol><li>Fill outlier: <code>data.fillna(0)</code></li><li>Fill mean value: <code>data.fillna(data.mean())</code></li></ol></li></ol><h3 id="Describe-your-project"><a href="#Describe-your-project" class="headerlink" title="Describe your project"></a>Describe your project</h3><ol><li>Abstract reality to math problem</li><li>Describe your data</li><li>Proprocessing and feature selection</li><li>Model training and tuning</li></ol><h2 id="Algorithm"><a href="#Algorithm" class="headerlink" title="Algorithm"></a>Algorithm</h2><h3 id="Logistic-regreesion"><a href="#Logistic-regreesion" class="headerlink" title="Logistic regreesion"></a>Logistic regreesion</h3><h4 id="Defination"><a href="#Defination" class="headerlink" title="Defination"></a>Defination</h4><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ro85d70dj209q017mwy.jpg" alt=""></p><h4 id="Loss-negative-log-los"><a href="#Loss-negative-log-los" class="headerlink" title="Loss: negative log los"></a>Loss: negative log los</h4><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2roe9667aj20ay04y0sp.jpg" alt=""></p><h3 id="Support-Vector-Machine"><a href="#Support-Vector-Machine" class="headerlink" title="Support Vector Machine"></a>Support Vector Machine</h3><h3 id="Decision-Tree"><a href="#Decision-Tree" class="headerlink" title="Decision Tree"></a>Decision Tree</h3><ol><li>ID3: use <code>information gain</code></li><li>C4.5: use <code>information gain rate</code><h3 id="Ensemble-Learning"><a href="#Ensemble-Learning" class="headerlink" title="Ensemble Learning"></a>Ensemble Learning</h3><h4 id="Boosting-AdaBoost-GBDT"><a href="#Boosting-AdaBoost-GBDT" class="headerlink" title="Boosting: AdaBoost GBDT"></a>Boosting: <code>AdaBoost</code> <code>GBDT</code></h4></li></ol><p>Seiral strategy, new learning machine is based on previous one</p><h4 id="GBDT-Gradient-Boosting-Decision-Tree"><a href="#GBDT-Gradient-Boosting-Decision-Tree" class="headerlink" title="GBDT(Gradient Boosting Decision Tree)"></a>GBDT(Gradient Boosting Decision Tree)</h4><h4 id="XGBoost"><a href="#XGBoost" class="headerlink" title="XGBoost"></a>XGBoost</h4><h4 id="Bagging-Random-forest-and-Dropout-in-Neural-Network"><a href="#Bagging-Random-forest-and-Dropout-in-Neural-Network" class="headerlink" title="Bagging: Random forest and Dropout in Neural Network"></a>Bagging: <code>Random forest</code> and <code>Dropout in Neural Network</code></h4><p>Parallel strategy, no dependency between learning machines.</p><h1 id="Deep-Learning"><a href="#Deep-Learning" class="headerlink" title="Deep Learning"></a>Deep Learning</h1><h2 id="Basic-Knowledge"><a href="#Basic-Knowledge" class="headerlink" title="Basic Knowledge"></a>Basic Knowledge</h2><h3 id="Overfitting-and-underfitting"><a href="#Overfitting-and-underfitting" class="headerlink" title="Overfitting and underfitting"></a>Overfitting and underfitting</h3><h4 id="Deal-with-overfitting"><a href="#Deal-with-overfitting" class="headerlink" title="Deal with overfitting"></a>Deal with overfitting</h4><ol><li><p>Data enhancement</p><ol><li>image: translation, rotation, scaling</li><li>GAN: generate new data</li><li>NLP: generate new data via neural machine translation</li></ol></li><li><p>Decrease the complexity of model</p><ol><li>neural network: decrease layer numbers and neuron numbers</li><li>decision tree: decrease tree depth and pruning</li></ol></li><li><p>Constrain weight:</p><ol><li>L1 regularization</li><li>L2 regularization</li></ol></li><li><p>Ensemble learning:</p><ol><li>Neural network: Dropout</li><li>Decision tree: random forest, GBDT</li></ol></li><li><p>early stopping</p></li></ol><h4 id="Deal-with-underfitting"><a href="#Deal-with-underfitting" class="headerlink" title="Deal with underfitting"></a>Deal with underfitting</h4><ol><li>add new feature</li><li>add model complexity</li><li>decrease regularization</li></ol><h3 id="Back-propagation-TODO-https-github-com-imhuay-Algorithm-Interview-Notes-Chinese-blob-master-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-E5-9F-BA-E7-A1-80-md"><a href="#Back-propagation-TODO-https-github-com-imhuay-Algorithm-Interview-Notes-Chinese-blob-master-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-E5-9F-BA-E7-A1-80-md" class="headerlink" title="Back-propagation TODO:https://github.com/imhuay/Algorithm_Interview_Notes-Chinese/blob/master/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80.md"></a>Back-propagation TODO:<a href="https://github.com/imhuay/Algorithm_Interview_Notes-Chinese/blob/master/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80.md">https://github.com/imhuay/Algorithm_Interview_Notes-Chinese/blob/master/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80.md</a></h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ubtex9hwj207101kjr8.jpg" alt=""></p><blockquote><p>上标 (l) 表示网络的层，(L) 表示输出层（最后一层）；下标 j 和 k 指示神经元的位置；w_jk 表示 l 层的第 j 个神经元与(l-1)层第 k 个神经元连线上的权重</p></blockquote><p>MSE as loss function:<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ubuojv5xj207z03ljr9.jpg" alt=""></p><p>another expression:<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ubub5wrkj20e907y0sv.jpg" alt=""></p><h3 id="Activation-function-improve-ability-of-expression"><a href="#Activation-function-improve-ability-of-expression" class="headerlink" title="Activation function: improve ability of expression"></a>Activation function: improve ability of expression</h3><h4 id="sigmoid-z"><a href="#sigmoid-z" class="headerlink" title="sigmoid(z)"></a>sigmoid(z)</h4><p>$$\sigma(z)=\frac{1}{1+exp(-z)}, where the range is [0, 1]$$</p><p>the derivative of simoid is:<br>TODO: to f(x)<br>$$f’(x)=f(x)(1-f(x))$$</p><h3 id="Batch-Normalization"><a href="#Batch-Normalization" class="headerlink" title="Batch Normalization"></a>Batch Normalization</h3><p>Goal: restrict data point to same distribution through normalization data before each layer.</p><h3 id="Optimizers"><a href="#Optimizers" class="headerlink" title="Optimizers"></a>Optimizers</h3><h4 id="SGD"><a href="#SGD" class="headerlink" title="SGD"></a>SGD</h4><p>Stochastic Gradient Descent, update weights each mini-batch</p><h4 id="Momentum"><a href="#Momentum" class="headerlink" title="Momentum"></a>Momentum</h4><p>Add former gradients with decay into current gradient.</p><h4 id="Adagrad"><a href="#Adagrad" class="headerlink" title="Adagrad"></a>Adagrad</h4><p>Dynamically adjust learning rate when training. </p><p>Learning rate is in reverse ratio to the sum of parameters.</p><h4 id="Adam"><a href="#Adam" class="headerlink" title="Adam"></a>Adam</h4><p>Dynamically adjust learning rate when training. </p><p>utilize first order moment estisourmation and second order moment estimation to make sure the steadiness.</p><h4 id="How-to-deal-with-L1-not-differentiable"><a href="#How-to-deal-with-L1-not-differentiable" class="headerlink" title="How to deal with L1 not differentiable"></a>How to deal with L1 not differentiable</h4><p>Update parameters along the axis direction.</p><h3 id="How-to-initialize-the-neural-network"><a href="#How-to-initialize-the-neural-network" class="headerlink" title="How to initialize the neural network"></a>How to initialize the neural network</h3><p>Init network with <strong>Gaussian Distribution</strong> or <strong>Uniform Distribution</strong>.</p><p>Glorot Initializer:<br>$$W_{i,j}~U(-\sqrt{\frac{6}{m+n}}, \sqrt{\frac{6}{m+n}})$$</p><h1 id="Computer-Vision"><a href="#Computer-Vision" class="headerlink" title="Computer Vision"></a>Computer Vision</h1><h2 id="Models-and-History"><a href="#Models-and-History" class="headerlink" title="Models and History"></a>Models and History</h2><ul><li>2015 VGGNet(16/19):    Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR 2015.</li><li>2015 GoogleNet:    </li><li>2016 Inception-v1/v2/v3:    Rethinking the Inception Architecture for Computer Vision, CVPR 2016.</li><li>2016 ResNet:    Deep Residual Learning for Image Recognition, CVPR 2016.</li><li>2017 Xception:    Xception: Deep Learning with Depthwise Separable Convolutions, CVPR 2017.</li><li>2017 InceptionResNet-v1/v2、Inception-v4</li><li>2017 MobileNet:    MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, arXiv 2017.</li><li>2017 DenseNet:    Densely Connected Convolutional Networks, CVPR 2017.</li><li>2017 NASNet:    Learning Transferable Architectures for Scalable Image Recognition, arXiv 2017.</li><li>2018 MobileNetV2:    MobileNetV2: Inverted Residuals and Linear Bottlenecks, CVPR 2018.</li></ul><h2 id="Basic-knowledge-1"><a href="#Basic-knowledge-1" class="headerlink" title="Basic knowledge"></a>Basic knowledge</h2><h1 id="Practice-experience"><a href="#Practice-experience" class="headerlink" title="Practice experience"></a>Practice experience</h1><h2 id="Loss-function-decline-to-0-0000"><a href="#Loss-function-decline-to-0-0000" class="headerlink" title="Loss function decline to 0.0000"></a>Loss function decline to 0.0000</h2><p>Because of <strong>overflow</strong> in Tensorflow or other framework. it is better to initialize parameters in a reasonable interval. The solution is <strong>Xavier initialization</strong> and <strong>Kaiming initialization</strong>.</p><h2 id="Do-not-normaolize-the-bias-in-neural-network"><a href="#Do-not-normaolize-the-bias-in-neural-network" class="headerlink" title="Do not normaolize the bias in neural network"></a>Do not normaolize the bias in neural network</h2><p>That will lead to underfitting because of sparse $b$</p><h2 id="Do-not-set-learning-rate-too-large"><a href="#Do-not-set-learning-rate-too-large" class="headerlink" title="Do not set learning rate too large"></a>Do not set learning rate too large</h2><p>When using Adam optimizer, try $10^{-3}$ to $10^{-4}$</p><h2 id="Do-not-add-activation-before-sotmax-layer"><a href="#Do-not-add-activation-before-sotmax-layer" class="headerlink" title="Do not add activation before sotmax layer"></a>Do not add activation before sotmax layer</h2><h2 id="Do-not-forget-to-shuffle-training-data"><a href="#Do-not-forget-to-shuffle-training-data" class="headerlink" title="Do not forget to shuffle training data"></a>Do not forget to shuffle training data</h2><p>For the sake of overfitting</p><h2 id="Do-not-use-same-label-in-a-batch"><a href="#Do-not-use-same-label-in-a-batch" class="headerlink" title="Do not use same label in a batch"></a>Do not use same label in a batch</h2><p>For the sake of overfitting</p><h2 id="Do-not-use-vanilla-SGD-optimizer"><a href="#Do-not-use-vanilla-SGD-optimizer" class="headerlink" title="Do not use vanilla SGD optimizer"></a>Do not use vanilla SGD optimizer</h2><p>Avoid getting into saddle point</p><h2 id="Please-checkout-gradient-in-each-layer"><a href="#Please-checkout-gradient-in-each-layer" class="headerlink" title="Please checkout gradient in each layer"></a>Please checkout gradient in each layer</h2><p>For the sake of potential gradient explosion, we need to use <strong>gradient clip</strong> to cut off gradient</p><h2 id="Please-checkout-your-labels-are-not-random"><a href="#Please-checkout-your-labels-are-not-random" class="headerlink" title="Please checkout your labels are not random"></a>Please checkout your labels are not random</h2><h2 id="Problem-of-classification-confidence"><a href="#Problem-of-classification-confidence" class="headerlink" title="Problem of classification confidence"></a>Problem of classification confidence</h2><p>Symptom: When losses increasing, but the accuracy still increasing</p><p>For the sake of <strong>confidence</strong>: [0.9,0.01,0.02,0.07] in epoch 5 VS [0.5,0.4,0.05,0.05] in epoch 20.</p><p>Overall, this phenomenon is kind of <strong>overfitting</strong>.</p><h2 id="Do-not-use-batch-normalization-layer-with-small-batch-size"><a href="#Do-not-use-batch-normalization-layer-with-small-batch-size" class="headerlink" title="Do not use batch normalization layer with small batch size"></a>Do not use batch normalization layer with small batch size</h2><p>The data in batch size can not represent the statistical feature over whole dataset。</p><h2 id="Set-BN-layer-in-the-front-of-Activation-or-behind-Activation"><a href="#Set-BN-layer-in-the-front-of-Activation-or-behind-Activation" class="headerlink" title="Set BN layer in the front of Activation or behind Activation"></a>Set BN layer in the front of Activation or behind Activation</h2><h2 id="Improperly-Use-dropout-in-Conv-layer-may-lead-to-worse-performance"><a href="#Improperly-Use-dropout-in-Conv-layer-may-lead-to-worse-performance" class="headerlink" title="Improperly Use dropout in Conv layer may lead to worse performance"></a>Improperly Use dropout in Conv layer may lead to worse performance</h2><p>It is better to use dropout layer in a low probability such as 0.1 or 0.2.</p><p>Just like add some noise to Conv layer for normalization.</p><h2 id="Do-not-initiate-weight-to-0-but-bias-can"><a href="#Do-not-initiate-weight-to-0-but-bias-can" class="headerlink" title="Do not initiate weight to 0, but bias can"></a>Do not initiate weight to 0, but bias can</h2><h2 id="Do-not-forget-your-bias-in-each-FNN-layer"><a href="#Do-not-forget-your-bias-in-each-FNN-layer" class="headerlink" title="Do not forget your bias in each FNN layer"></a>Do not forget your bias in each FNN layer</h2><h2 id="Evaluation-accuracy-better-than-training-accuracy"><a href="#Evaluation-accuracy-better-than-training-accuracy" class="headerlink" title="Evaluation accuracy better than training accuracy"></a>Evaluation accuracy better than training accuracy</h2><p>Because the distributions between training set and test set have large difference.</p><p>Try methods in transfer learning.</p><h2 id="KL-divergence-goes-negative-number"><a href="#KL-divergence-goes-negative-number" class="headerlink" title="KL divergence goes negative number"></a>KL divergence goes negative number</h2><p>Need to pay attention to softmax for computing probability.</p><h2 id="Nan-values-appear-in-numeral-calculation"><a href="#Nan-values-appear-in-numeral-calculation" class="headerlink" title="Nan values appear in numeral calculation"></a>Nan values appear in numeral calculation</h2><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://blog.csdn.net/LoseInVain/article/details/83021356">深度学习debug沉思录</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h1 id=&quot;Machine-Learning&quot;&gt;&lt;a href=&quot;#Machine-Learning&quot; class=&quot;headerlink&quot; title=&quot;Machine Le
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="note" scheme="https://824zzy.github.io/tags/note/"/>
    
  </entry>
  
  <entry>
    <title>The first and second generation of Neural Turing Machine</title>
    <link href="https://824zzy.github.io/2019/05/05/Hybrid-computing-using-a-neural-network-with-dynamic-external-memory/"/>
    <id>https://824zzy.github.io/2019/05/05/Hybrid-computing-using-a-neural-network-with-dynamic-external-memory/</id>
    <published>2019-05-05T22:15:30.000Z</published>
    <updated>2019-05-12T05:27:13.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Hand-writing-pdf-version"><a href="#Hand-writing-pdf-version" class="headerlink" title="Hand-writing pdf version:"></a>Hand-writing pdf version:</h2><div class="row"><iframe src="https://drive.google.com/file/d/1H63IlKB8ekJWUO8rcKfGHYtBhZfRRnR4/preview" style="width:100%; height:550px"></iframe></div><h2 id="Structure"><a href="#Structure" class="headerlink" title="Structure"></a>Structure</h2><p>Computer has a CPU and a RAM.</p><p>Differential Neural Computer has a neural network as <strong>the controller</strong> that take the role of <strong>the CPU</strong>.<br>The memory is an $N <em> W$ <strong>matrix</strong> that take the role of <em>*the RAM</em></em>, where $N$ means the locations and $W$ means the length of each pieces of memory.</p><h2 id="Memory-augmentation-and-attention-mechanism"><a href="#Memory-augmentation-and-attention-mechanism" class="headerlink" title="Memory augmentation and attention mechanism"></a>Memory augmentation and attention mechanism</h2><blockquote><p>The episodic memories or evenet memories are known to depend on the hippocampus in the human brain.</p></blockquote><p>The main point is that the memory of the network is external to the network itself.</p><p>The attention mechanism defines some distributions over the $N$ locations.<br>Each $i-th$ component of a weighting vector will communicate how much attention the controller should give to the content in the $i-th$ location of the memory.</p><h2 id="Differntiability"><a href="#Differntiability" class="headerlink" title="Differntiability"></a>Differntiability</h2><p>Every unit and operation in this structure is differentiable.</p><h2 id="Weightings"><a href="#Weightings" class="headerlink" title="Weightings"></a>Weightings</h2><p>The controller wants to do something which involves memory, and it doesn’t just look at every location of the memor.<br>Instead, it focues its attention on those locations which contain the information it is looking for.</p><p>The weighting produced for an input is a distribution over the N locations for their relative importance in a particular process(reading or writing).</p><p>Note that the weightings are produced by means by a vector emitted by the controller, which is called <strong>interface vector</strong>. The </p><h2 id="Three-interactions-between-controller-and-memory"><a href="#Three-interactions-between-controller-and-memory" class="headerlink" title="Three interactions between controller and memory"></a>Three interactions between controller and memory</h2><p>The controller and memory are mediated by the <strong>interface vector</strong>.</p><h3 id="Content-lookup"><a href="#Content-lookup" class="headerlink" title="Content lookup"></a>Content lookup</h3><p>A particular set of values within the interface vector, which we will collect in something called key vector, is compared to the content of each location. This comparison is made by means of a similarity measure.</p><h3 id="Temporal-memory-linkage"><a href="#Temporal-memory-linkage" class="headerlink" title="Temporal memory linkage"></a>Temporal memory linkage</h3><p>The transitions between consecutively written locations are recorded in an $N * N$ matrix, called temproal link matrix “L”. The sequence by which the controller writes in the memory is an information by itself, and it is something we want to store.</p><p>DNC stores the ‘temporal link’ to keep track of the order things where written in, and records the current ‘usage’ level of each memory location.</p><h3 id="Dynamic-memory-allocation"><a href="#Dynamic-memory-allocation" class="headerlink" title="Dynamic memory allocation"></a>Dynamic memory allocation</h3><p>Each location has a usage level represented as a number from 0 to 1. A weighting that picks out an unused location is sent to the write head, so that it knows where to store new information. The word “dynamic” refers to the ability of the controller to reallocate memory that is no longer required, erasing its content.</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://towardsdatascience.com/rps-intro-to-differentiable-neural-computers-e6640b5aa73a">Differentiable Neural Computers: An Overview</a></li><li><a href="https://deepmind.com/blog/differentiable-neural-computers/">Deepmind-&gt;Differentiable neural computers</a></li><li><a href="https://slideplayer.com/slide/14373603/">DNC-slide</a></li><li><a href="https://www.slideshare.net/databricks/demystifying-differentiable-neural-computers-and-their-brain-inspired-origin-with-luis-leal">Demystifying Differentiable Neural Computers and Their Brain Inspired Origin with Luis Leal</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Hand-writing-pdf-version&quot;&gt;&lt;a href=&quot;#Hand-writing-pdf-version&quot; class=&quot;headerlink&quot; t
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="agi" scheme="https://824zzy.github.io/tags/agi/"/>
    
  </entry>
  
  <entry>
    <title>Inspiration of On-intelligence</title>
    <link href="https://824zzy.github.io/2019/04/24/on-intelligence/"/>
    <id>https://824zzy.github.io/2019/04/24/on-intelligence/</id>
    <published>2019-04-24T22:15:30.000Z</published>
    <updated>2019-04-25T11:38:07.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><p>Please note that all these ideas may prove to be wrong or will be revised.</p><h2 id="Artificial-Intelligence-wrong-way"><a href="#Artificial-Intelligence-wrong-way" class="headerlink" title="Artificial Intelligence: wrong way"></a>Artificial Intelligence: wrong way</h2><p>We are on the wrong way of Artificial General Intelligence(AGI).</p><p>The biggest mistake is the belief that intelligence is defined by intelligent behavior.<br>Object detection or other tasks are the manifestations of intelligence not the intelligence itself.</p><p>The great brain uses vast amounts of memory to create a model of the world, everything you know and have learned is stored in this model.</p><p>The ability to make predictions about the future that is the crux of intelligence.</p><h2 id="Neural-Networks"><a href="#Neural-Networks" class="headerlink" title="Neural Networks:"></a>Neural Networks:</h2><ol><li>We must include time as brain function: real brains process rapidly changing streams of information.</li><li>The importance of feedback: In thalamus(丘脑), connections going backward toward the input exceed the connections going forward by almost a factor of ten. But <strong>back propagation is not really feedback</strong>, because it is only occurred during the learning phase.</li><li>Brain is organized as a repeating hierarchy.</li></ol><p>History shows that the best solution to scitific problems are simple and elegant.</p><h2 id="The-Human-Brain-all-your-knowledge-of-the-world-is-a-model-based-on-patterns"><a href="#The-Human-Brain-all-your-knowledge-of-the-world-is-a-model-based-on-patterns" class="headerlink" title="The Human Brain: all your knowledge of the world is a model based on patterns"></a>The Human Brain: all your knowledge of the world is a model based on patterns</h2><ol><li>The neocortex is about 2 milimeters thick and has six layers, each approximated by one card.</li><li>The mind is the creation of the cells in the brain. <strong>There is nothing else.</strong>And remember the cortex is built using a common repeated element.</li><li>The cortex uses the same computational tool to accomplish everything it does.</li></ol><p>According to <a href="https://en.wikipedia.org/wiki/Vernon_Benjamin_Mountcastle#Research_and_career">Mountcastle</a>‘s proposal:<br>The algorithm of cortex must be expressed independently of any particular function or sense.<br>The cortex does something universal that can be applied to any type of sensory or motor system.</p><blockquote><p>When scientists and engineers try to understand vision or make computer that can “see”, they devise terminologies and techniques specific to vision.<br>They talk about edges, textures, and three-dimensional representations.<br>If they want to understand spoken language, they build algorithms based on rules of grammar, syntax and semantics.</p></blockquote><p><strong>But these approaches are not how the brain solves these problems, and are therefore likely to fail.</strong></p><p>Attention mechanism:<br>About three times every second, your eyes make a sudden movement called a saccade.<br>Many vision research ignore saccades and the rapidly changing patterns of vision.</p><p>Existence may be objective, but the spatial-temporal pattern flowing into the axon bundles in our brains are all we have to go on.</p><h2 id="Memory"><a href="#Memory" class="headerlink" title="Memory"></a>Memory</h2><p>The brain does not “compute” the answers to problems, it <strong>retrieves the answers from memory</strong>.<br>The entire cortex is a memory system rather than a computer at all.</p><p>The memory is <code>invariant representations</code>, which handle variations in the world automatically.</p><ul><li><p>The neocortex stores <strong>sequences of patterns</strong><br>There are thousands of detailed memories stored in the synapses of our brains that are rarely used.<br>At any point in time we recall only a tiny fraction of what we know.(remind A-Z is easy, Z-A is hard)</p></li><li><p>The neocortex recalls patterns <strong>auto-associatively</strong>.<br>Your eyes only see parts of a body, but your brain fills in the rest.<br>At any time, a piece can activate the whole. This is the essence of auto-associative memories or inferring.<br><strong>Thought and memories are associately linked, notice that random thoughts never really occur!</strong></p></li><li><p>The neocortex stores patterns in an <strong>invariant form</strong>.<br>We do not remember or recall things with complete fidelity.<br>Because the brain remembers the important relationships in the world, independent of the details.<br>To make a specific prediction, the brain must combine knowledge of the invariant structure with the most recent details.</p><blockquote><p>When listening to a familar song played on a piano, your cortex predicts the next note before it is played. And when listening to people speak, you often know what they are going to say before they have finished speaking.</p></blockquote></li><li><p>The neocortex stores patterns in a <strong>hierarchy</strong>.</p></li></ul><h2 id="A-New-Framework-of-Intelligence-Hierarchy"><a href="#A-New-Framework-of-Intelligence-Hierarchy" class="headerlink" title="A New Framework of Intelligence: Hierarchy"></a>A New Framework of Intelligence: Hierarchy</h2><p>The brain is using memories to form predictions about what it expects to experience before experience it.<br>When prediction is violated, attention is drawn to the error.<br>Incorret predictions result in confusion and prompt you to pay attention.<br><strong>Your brain has made a model of the world and is constantly checking that model against reality.</strong></p><p>By comparing the actual sensory input with recalled memory, the animal not only understands where it is but can see into the future.</p><h2 id="How-the-Cortex-Works"><a href="#How-the-Cortex-Works" class="headerlink" title="How the Cortex Works"></a>How the Cortex Works</h2><p>If you don’t have a picture of puzzle’s solution, <strong>the bottom-up method</strong> is sometimes the only way to proceed.</p><p>Here is an interesting metaphor: </p><blockquote><p>Many of puzzle pieces will not be used in the ultimate solution, but you don’t know which one or how many.</p></blockquote><p>I can not approve the ideas from Hawkins in this part. Still we don’t know how the cortex works actually.</p><h2 id="How-the-Cortex-Learns"><a href="#How-the-Cortex-Learns" class="headerlink" title="How the Cortex Learns"></a>How the Cortex Learns</h2><blockquote><p>Donlad O.Hebb, Hebbian learing: When two neurons fire at the same tiem, the synapses between them get strengthened</p></blockquote><ol><li>Forming the classifications of patterns.</li><li>Building memory sequences.</li></ol><p>Note that prior to neocortex, the brain has:</p><ol><li>The Basal ganglia(基底神经节): Primitive motor system.</li><li>The cerebellum(小脑): Leared precise timing relationships of evenets.</li><li>The hippocampus(海马体): stored memories of specific events and places.</li></ol><p><strong>The hippocampus is the top region of the neocortex, not a separate structure.</strong></p><p>There are many more secrets to be discovered than we currently know</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://en.wikipedia.org/wiki/On_Intelligence">On intelligence, Jeff Hawkins</a></li><li><a href="https://en.wikipedia.org/wiki/Vernon_Benjamin_Mountcastle#Research_and_career">Mountcastle</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;Please note that all these ideas may prove to be wrong or will be revised.&lt;/p&gt;
&lt;h2 id=&quot;
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="agi" scheme="https://824zzy.github.io/tags/agi/"/>
    
  </entry>
  
  <entry>
    <title>零样本学习的视频描述</title>
    <link href="https://824zzy.github.io/2019/04/07/zero-shot-for-VD/"/>
    <id>https://824zzy.github.io/2019/04/07/zero-shot-for-VD/</id>
    <published>2019-04-07T22:15:30.000Z</published>
    <updated>2019-04-07T09:32:28.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Hand-writing-pdf-version"><a href="#Hand-writing-pdf-version" class="headerlink" title="Hand-writing pdf version"></a>Hand-writing pdf version</h2><div class="row"><iframe src="https://drive.google.com/file/d/1XtGej5wnl5hiJebI38wYpQrI6TIjndSs/preview" style="width:100%; height:550px"></iframe></div><h2 id="Hand-writing-image-version"><a href="#Hand-writing-image-version" class="headerlink" title="Hand-writing image version"></a>Hand-writing image version</h2><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g1u7a6lduaj20zf19u1d4.jpg" alt=""><br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g1u7aj7ktxj20zf19uk31.jpg" alt=""><br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g1u7as81fnj20zf19utfw.jpg" alt=""></p>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Hand-writing-pdf-version&quot;&gt;&lt;a href=&quot;#Hand-writing-pdf-version&quot; class=&quot;headerlink&quot; t
      
    
    </summary>
    
      <category term="VideoCaptioning" scheme="https://824zzy.github.io/categories/VideoCaptioning/"/>
    
    
      <category term="zeroshot" scheme="https://824zzy.github.io/tags/zeroshot/"/>
    
  </entry>
  
  <entry>
    <title>基于分层强化学习的视频描述</title>
    <link href="https://824zzy.github.io/2019/03/10/HRL-video-captioning/"/>
    <id>https://824zzy.github.io/2019/03/10/HRL-video-captioning/</id>
    <published>2019-03-10T16:49:46.000Z</published>
    <updated>2019-03-10T11:09:13.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名：Video Captioning via Hierarchical Reinforcement Learning</p></li><li><p>论文链接：<a href="https://ieeexplore.ieee.org/document/8578541/">https://ieeexplore.ieee.org/document/8578541/</a></p></li><li><p>论文源码：</p><ul><li>None</li></ul></li><li><p>关于笔记作者：</p><ul><li>朱正源,北京邮电大学研究生，研究方向为多模态与认知计算。  </li></ul></li></ol><hr><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>视频描述中细粒度的动作描述仍然是该领域中一个巨大的挑战。该论文创新点分为两部分：1. 通过层级化的强化学习框架，使用高层manager识别粗粒度的视频信息并控制描述生成的目标，使用低层的worker识别细粒度的动作并完成目标。2. 提出Charades数据集。</p><hr><h1 id="Video-Captioning-via-Hierarchical-Reinforcement-Learning"><a href="#Video-Captioning-via-Hierarchical-Reinforcement-Learning" class="headerlink" title="Video Captioning via Hierarchical Reinforcement Learning"></a>Video Captioning via Hierarchical Reinforcement Learning</h1><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g0xt4i7vsnj20qs0k47lv.jpg" alt=""></p><h2 id="Framework-of-model"><a href="#Framework-of-model" class="headerlink" title="Framework of model"></a>Framework of model</h2><ol><li><p>Work processing</p><ul><li><p><strong>Pretrained CNN</strong> encoding stage we obtain:<br>video frame features: $v={v_i}$, where $i$ is index of frames.</p></li><li><p>Language Model encoding stage we obtain:<br>Worker : $h^{E_w}={h_i^{E_w}}$ from low-level <strong>Bi-LSTM</strong> encoder<br>Manager: $h^{E_m}={h_i^{E_m}}$ from high <strong>LSTM</strong> encoder</p></li><li><p>HRL agent decoding stage we obtain:<br>Language description:$a<em>{1}a</em>{2}…a_{T}$, where $T$ is the length of generated caption.</p></li></ul></li><li><p>Details in HRL agent:</p><ol><li>High-level manager:<ul><li>Operate at lower temporal resolution.</li><li>Emits a goal for worker to accomplish.</li></ul></li><li>Low-level worker<ul><li>Generate a word for each time step by following the goal.</li></ul></li><li>Internal critic <ul><li>Determin if the worker has accomplished the goal</li></ul></li></ol></li><li><p>Details in Policy Network:</p><ol><li>Attention Module:<ol><li>At each time step t: $c<em>t^W=\sum\alpha</em>{t,i}^{W}h^{E_w}_i$</li><li>Note that attention score $\alpha<em>{t,i}^{W}=\frac{exp(e</em>{t, i})}{\sum_{k=1}^{n}exp(e<em>t, k)}$, where $e</em>{t,i}=w^{T} tanh(W<em>{a} h</em>{i}^{E<em>w} + U</em>{a} h^{W}_{t-1})$</li></ol></li><li>Manager and Worker:<ol><li>Manage: take $[c_t^M, h_t^M]$ as input to produce goal. Goal is obtained through a MLP.</li><li>Worker: receive the goal $g_t$ and take the concatenation of $c_t^W, g<em>t, a</em>{t-1}$ as input, and outputs the probabilities of $\pi_t$ over all action $a_t$.</li></ol></li><li>Internal Critic:<ol><li>evaluate worker’s progress. Using an RNN struture takes a word sequence as input to discriminate whether end.</li><li>Internal Critic RNN take $h^I_{t-1}, a_t$ as input, and generate probability $p(z_t)$.</li></ol></li></ol></li><li><p>Details in Learning:</p><ol><li>Definition of Reward:<br>$R(a<em>t)$ = $\sum</em>{k=0} \gamma^{k} f(a_{t+k})$ , where　 $f(x)=CIDEr(sent+x)-CIDEr(sent)$ and $sent$ is previous generated caption.</li><li>Pseudo Code of HRL training algorithm:<pre><code class="py"><span class="keyword">import</span> training_pairs<span class="keyword">import</span> pretrained_CNN, internal_critic<span class="keyword">for</span> i <span class="keyword">in</span> range(M):Initial_random(minibatch)<span class="keyword">if</span> Train_Worker:  goal_exploration(enable=<span class="literal">False</span>)  sampled_capt = LSTM() <span class="comment"># a_1, a_2, ..., a_T</span>  Reward = [r_i <span class="keyword">for</span> r_i <span class="keyword">in</span> calculate_R(sampled_caption)]  Manager(enable=<span class="literal">False</span>)  worker_policy = Policy_gradient(Reward)<span class="keyword">elif</span> Train_Manager:  Initial_ramdom_process(N)  greedy_decoded_cap = LSTM()  Reward = [r_i <span class="keyword">for</span> r_i <span class="keyword">in</span> calculate_R(sampled_caption)]  Worker(enable=<span class="literal">False</span>)  manager_policy = Policy_gradient(Reward)</code></pre></li></ol></li></ol><h2 id="All-in-one"><a href="#All-in-one" class="headerlink" title="All in one"></a>All in one</h2><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g0xt54v9puj21ao0p27c8.jpg" alt=""></p><h2 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h2><ol><li><a href="http://ms-multimedia-challenge.com/2017/challenge">MSR-VTT</a><blockquote><p>该数据集包含50个小时的视频和26万个相关视频描述。</p></blockquote></li></ol><ol><li><a href="https://mila.quebec/en/publications/public-datasets/m-vad/">Charades</a><blockquote><p>Charades Captions:室内互动的9848个视频，包含157个动作的66500个注解，46个类别的物体的41104个标签，和共27847个文本描述。</p></blockquote></li></ol><h2 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>实验结果</h2><ol><li><p>实验可视化<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g0xs2qfw1rj220k0hce5u.jpg" alt=""></p></li><li><p>模型对比<br><img src="https://ws1.sinaimg.cn/mw690/ca26ff18gy1g0xs1f57tkj21120hwgpl.jpg" alt=""></p></li></ol>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;论文基本信息&quot;&gt;&lt;a href=&quot;#论文基本信息&quot; class=&quot;headerlink&quot; title=&quot;论文基本信息&quot;&gt;&lt;/a&gt;论文基本信息&lt;/h2&gt;&lt;ol&gt;
&lt;l
      
    
    </summary>
    
      <category term="VideoCaptioning" scheme="https://824zzy.github.io/categories/VideoCaptioning/"/>
    
    
      <category term="reinforcement_learning" scheme="https://824zzy.github.io/tags/reinforcement-learning/"/>
    
  </entry>
  
  <entry>
    <title>HTM_theory</title>
    <link href="https://824zzy.github.io/2019/03/03/HTM-theory/"/>
    <id>https://824zzy.github.io/2019/03/03/HTM-theory/</id>
    <published>2019-03-03T18:08:50.000Z</published>
    <updated>2019-04-07T09:15:05.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><ol><li>Overview<br>cellular structure are same.(bio pic)</li></ol><p>hierarchical structure.(need a pic)and each region is performing the same set of processes on the input data.</p><p>[SDR] sparse distributed representations(0 or 1)</p><p>Input: 1. Motor commands 2. sensory input</p><p>Encoder: takes a datatype and converts it into a sparse distributed representations. </p><p>Temporal means that systems learn continuously, every time it receives input it is attemptiing to predict what is going to happen next.</p><ol><li>Sparse Distributed Representation(SDR)<br>Terms: 1. n=Bit array length 2. w=Bits of array 3. sparsiity 4. Dense Bit Array Capacity=2**of bits<br>Bit array</li></ol><p>Capacity=n!/w!(n-w)!<br><img src="https://ws1.sinaimg.cn/mw690/ca26ff18gy1g0pke65e1dj218g11awlp.jpg" alt=""></p><p>OVERLAP/UNION<br>Similarity can be represented by overlap/union? of SDR.</p><p>MATCH</p><ol><li>Overlap Sets and Subsampling</li><li><p>Scalar Encoder(retina/cochlea)</p><ul><li>Scalar Encoder: consecutive one</li><li>Random Distributed Scalar Encoder: random one</li></ul></li><li><p>Data-time encoder </p></li><li>Input Space&amp; Connections<ul><li>Spactial Pooler: maintain a fixed sparsity &amp; maintain a overlap properties.</li></ul></li></ol>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;ol&gt;
&lt;li&gt;Overview&lt;br&gt;cellular structure are same.(bio pic)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;hierarchical stru
      
    
    </summary>
    
    
  </entry>
  
  <entry>
    <title>基于时序结构的视频描述</title>
    <link href="https://824zzy.github.io/2019/03/02/describing-videos-by-exploiting-tempporal-structure/"/>
    <id>https://824zzy.github.io/2019/03/02/describing-videos-by-exploiting-tempporal-structure/</id>
    <published>2019-03-02T17:49:46.000Z</published>
    <updated>2019-03-02T13:53:13.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名：Describing Videos by Exploiting Temporal Structure</p></li><li><p>论文链接：<a href="https://arxiv.org/pdf/1502.08029">https://arxiv.org/pdf/1502.08029</a></p></li><li><p>论文源码：</p><ul><li><a href="https://github.com/tsenghungchen/SA-tensorflow">https://github.com/tsenghungchen/SA-tensorflow</a></li></ul></li><li><p>关于笔记作者：</p><ul><li>朱正源,北京邮电大学研究生，研究方向为多模态与认知计算。  </li></ul></li></ol><hr><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>本文是蒙特利尔大学发表在ICCV2015的研究成果，其主要创新点在于提出了时序结构并且利用注意力机制达到了在2015年的SOTA。通过3D-CNN捕捉视频局部信息和注意力机制捕捉全局信息相结合，可以全面提升模型效果。<br>其另一个重要成果是MVAD电影片段描述数据集，此<a href="https://mila.quebec/en/publications/public-datasets/m-vad/">数据集</a>已经成为了当前视频描述领域主流的数据集。</p><hr><h2 id="Describing-Videos-by-Exploiting-Temporal-Structure"><a href="#Describing-Videos-by-Exploiting-Temporal-Structure" class="headerlink" title="Describing Videos by Exploiting Temporal Structure"></a>Describing Videos by Exploiting Temporal Structure</h2><h3 id="视频描述任务介绍："><a href="#视频描述任务介绍：" class="headerlink" title="视频描述任务介绍："></a>视频描述任务介绍：</h3><p>根据视频生成单句的描述，一例胜千言：</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcxfmvyxuj20si0hqqfg.jpg" alt=""></p><p>　　A monkey pulls a dog’s tail and is chased by the dog.</p><p>2015年较早的模型：<br><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcwjf53bsj214x0ksajx.jpg" alt="LSTM-YT模型"></p><h3 id="2015年之前的模型存在的问题"><a href="#2015年之前的模型存在的问题" class="headerlink" title="2015年之前的模型存在的问题"></a>2015年之前的模型存在的问题</h3><ol><li>输出的描述没有考虑到动态的<strong>时序结构</strong>。</li><li>之前的模型利用一个特征向量来表示视频中的所有帧，导致无法识别视频中物体出现的<strong>先后顺序</strong>。</li></ol><h3 id="论文思路以及创新点"><a href="#论文思路以及创新点" class="headerlink" title="论文思路以及创新点"></a>论文思路以及创新点</h3><ol><li>通过局部和全局的时序结构来产生视频描述：</li></ol><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0odfi9c82j20zk0eih1g.jpg" alt=""></p><p>  针对Decoder生成的每一个单词，模型都会关注视频中特定的某一帧。</p><ol><li>使用3-D CNN来捕捉视频中的动态时序特征。</li></ol><h3 id="模型结构设计"><a href="#模型结构设计" class="headerlink" title="模型结构设计"></a>模型结构设计</h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0oe02i6tjj21c80k47e5.jpg" alt=""></p><ul><li>Encoder(3-D CNN + 2-D GoogLeNet)的设置：3 * 3 * 3 的三维卷积核，并且是3-D CNN在行为识别数据集上预训练好的。</li></ul><p>每个卷积层后衔接ReLu激活函数和Local max-pooling， dropout参数设置为0.5。</p><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0oe2iz7iaj20v20ien2y.jpg" alt=""></p><ul><li>Decoder(LSTM)的设置：使用了additive attention作为注意力机制，下图为在两个数据集上的超参数设置：<br><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0osevs2qoj21620pwafz.jpg" alt=""></li></ul><h3 id="实验细节"><a href="#实验细节" class="headerlink" title="实验细节"></a>实验细节</h3><h4 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h4><ol><li><a href="http://www.cs.utexas.edu/users/ml/clamp/videoDescription/">Microsoft Research Video Description dataset</a></li></ol><blockquote><p>1970条Youtobe视频片段：每条大约10到30秒，并且只包含了一个活动，其中没有对话。1200条用作训练，100条用作验证，670条用作测试。</p></blockquote><ol><li><a href="https://mila.quebec/en/publications/public-datasets/m-vad/">Montreal Video Annotation Dataset</a></li></ol><blockquote><p>数据集包含从92部电影的49000个视频片段，并且每个视频片段都被标注了描述语句。</p></blockquote><h4 id="评估指标"><a href="#评估指标" class="headerlink" title="评估指标"></a>评估指标</h4><ul><li>BLEU</li></ul><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0os1pgs0mj20qe0kgqh7.jpg" alt=""></p><ul><li>METEOR</li></ul><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0os2ibx04j20su0mgh2g.jpg" alt=""></p><ul><li>CIDER</li><li>Perplexity</li></ul><h4 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>实验结果</h4><ol><li>实验可视化<br><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0ornzkuylj220a144u0x.jpg" alt="实验结果"></li></ol><p>柱状图表示每一帧生成对应颜色每个单词时的注意力权重。</p><ol><li>模型对比<br><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0ormgxp41j22120ggten.jpg" alt="模型对比"></li></ol><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ul><li><a href="https://arxiv.org/pdf/1502.08029">Describing Videos by Exploiting Temporal Structure</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;论文基本信息&quot;&gt;&lt;a href=&quot;#论文基本信息&quot; class=&quot;headerlink&quot; title=&quot;论文基本信息&quot;&gt;&lt;/a&gt;论文基本信息&lt;/h2&gt;&lt;ol&gt;
&lt;l
      
    
    </summary>
    
      <category term="VideoCaptioning" scheme="https://824zzy.github.io/categories/VideoCaptioning/"/>
    
    
      <category term="videoCaptioning" scheme="https://824zzy.github.io/tags/videoCaptioning/"/>
    
  </entry>
  
  <entry>
    <title>视频描述领域的第一篇深度模型论文</title>
    <link href="https://824zzy.github.io/2019/01/19/first-deep-model-in-video-captioning/"/>
    <id>https://824zzy.github.io/2019/01/19/first-deep-model-in-video-captioning/</id>
    <published>2019-01-20T04:00:00.000Z</published>
    <updated>2019-01-20T09:04:38.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名：Translating Videos to Natural Language Using Deep Recurrent Neural Networks</p></li><li><p>论文链接：<a href="https://www.cs.utexas.edu/users/ml/papers/venugopalan.naacl15.pdf">https://www.cs.utexas.edu/users/ml/papers/venugopalan.naacl15.pdf</a></p></li><li><p>论文源码：</p><ul><li><a href="https://github.com/vsubhashini/caffe/tree/recurrent/examples/youtube">https://github.com/vsubhashini/caffe/tree/recurrent/examples/youtube </a></li></ul></li><li><p>关于笔记作者：</p><ul><li>朱正源,北京邮电大学研究生，研究方向为多模态与认知计算。  </li></ul></li></ol><hr><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>假设我们在未来已经实现了通用人工智能，当我们回首向过去看，到底哪个时代会被投票选为最重要的“Aha Moment”呢？</p><p>作为没有预知未来能力的普通人。为了回答这个问题，首先需要明确的一点就是：我们现在究竟处在实现通用人工智能之前的哪个位置？</p><p>一个常用的比喻便是，如果把从开始尝试到最终实现通用人工智能比作一条一公里的公路的话。大部分人可能会认为我们已经走了200米到500米之间。但是真实的情况可能是，我们仅仅走过了5厘米不到。</p><p>因为在通往正确道路的各种尝试中，有很大一部分会犯方向性错误。当我们在错误的道路上越走越远的时候，那么肯定无法到达终点。推倒现有成果重新来过便是不可避免的。我们需要时时刻刻保持谨小慎微，以躲避“岔路口”。</p><p>现在有理由相信（其实是因为不得不掩耳盗铃），我们正走在一条正确的道路上。如果非要说现在的技术有哪些让我感觉不那么符合我的直觉的地方的话，我肯定会抢着回答：We are not living in the books or images.</p><p>公元前五亿年前，当我们还是扁形虫的时候，那时候我们便会在未知的环境中为了生存下去作出连续的决策。</p><p>公元前两亿年前，我们进化成啮齿类动物，并且拥有了一套完整的操作系统。不变的是，不断连续变化的生存环境。</p><p>公元前四百万年前，原始人类进化出了大脑皮层之后，终于拥有了进行推理和思考的能力。但是这一切是在他们发明文字和语言之前。</p><p>现如今，当人类巨灵正在尝试创造出超越本身智能的超智能体时，却神奇的忽略了超智能体也应该生存在不断变化的、充满危险的世界之中。</p><p>回到最开始的问题，我一定会把票投在利用神经模型来处理视频流的模型上。</p><hr><h2 id="Translating-Videos-to-Natural-Language-Using-Deep-Recurrent-Neural-Networks"><a href="#Translating-Videos-to-Natural-Language-Using-Deep-Recurrent-Neural-Networks" class="headerlink" title="Translating Videos to Natural Language Using Deep Recurrent Neural Networks"></a>Translating Videos to Natural Language Using Deep Recurrent Neural Networks</h2><h3 id="视频描述任务介绍："><a href="#视频描述任务介绍：" class="headerlink" title="视频描述任务介绍："></a>视频描述任务介绍：</h3><p>根据视频生成单句的描述，一例胜千言：</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcxfmvyxuj20si0hqqfg.jpg" alt=""></p><p>　　A monkey pulls a dog’s tail and is chased by the dog.</p><h3 id="视频描述的前世："><a href="#视频描述的前世：" class="headerlink" title="视频描述的前世："></a>视频描述的前世：</h3><p>管道方法（PipeLine Approach）</p><ol><li>从视频中识别出<code>主语</code>、<code>动作</code>、<code>宾语</code>、<code>场景</code></li><li>计算被识别出实体的置信度</li><li>根据最高置信度的实体与预先设置好的模板，进行句子生成</li></ol><p>　在神经模型风靡之前，传统方法集中使用<strong>隐马尔科夫模型识别实体</strong>和<strong>条件随机场生成句子</strong></p><h3 id="神经模型的第一次尝试："><a href="#神经模型的第一次尝试：" class="headerlink" title="神经模型的第一次尝试："></a>神经模型的第一次尝试：</h3><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcwjf53bsj214x0ksajx.jpg" alt="LSTM-YT模型"></p><ol><li>从视频中，每十帧取出一帧进行分析</li></ol><p>　<img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzczmke1dbj20vg0astdq.jpg" alt=""><br>　人类眼睛的帧数是每秒24帧，从仿生学的观点出发，模型也不需要处理视频中所有的帧。再对视频帧进行缩放以便计算机进行处理。</p><ol><li>使用CNN提取特征并进行平均池化（Mean Pooling）</li></ol><p>　<img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzczv0iwy6j20x40mktg0.jpg" alt=""></p><ul><li><p>预训练的Alexnet[2012]:在120万张图片上进行预训练[ImageNet LSVRC-2012]，提取最后一层（第七层全连接层）的特征（4096维）。注意：提取的向量不是最后进行分类的1000维特征向量。</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzd0iak8vhj216o0eqacn.jpg" alt="Alexnet"></p></li><li><p>对所有的视频帧进行池化</p></li></ul><ol><li>句子生成</li></ol><p>　<img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzd06rquyyj20qk0nwgo6.jpg" alt="RNN生成句子"></p><h3 id="迁移学习和微调模型"><a href="#迁移学习和微调模型" class="headerlink" title="迁移学习和微调模型"></a>迁移学习和微调模型</h3><ol><li>在图片描述任务进行预训练</li></ol><p>　<img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzd0skao7sj216c0jm0zn.jpg" alt="transfer-learning from image captioning"></p><ol><li>微调（Fine-tuning）<br>　需要注意的是，在视频描述过程中：<ul><li>将输入从图片转换为视频；</li><li>添加了平均池化特征这个技巧；</li><li>模型进行训练的时候使用了更低的学习率</li></ul></li></ol><h3 id="实验细节"><a href="#实验细节" class="headerlink" title="实验细节"></a>实验细节</h3><h4 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h4><ol><li><a href="http://www.cs.utexas.edu/users/ml/clamp/videoDescription/">Microsoft Research Video Description dataset</a></li></ol><blockquote><p>1970条Youtobe视频片段：每条大约10到30秒，并且只包含了一个活动，其中没有对话。1200条用作训练，100条用作验证，670条用作测试。</p></blockquote><p><img src="https://ws1.sinaimg.cn/mw690/ca26ff18ly1fzd1cmxyalj217g0lcdyf.jpg" alt="dataset"></p><ol><li><a href="https://blog.csdn.net/daniaokuye/article/details/78699138">MSCOCO数据集下载</a></li><li><a href="https://blog.csdn.net/gaoyueace/article/details/80564642">Flickr30k数据集下载</a></li></ol><h4 id="评估指标"><a href="#评估指标" class="headerlink" title="评估指标"></a>评估指标</h4><ul><li>SVO(Subject, Verb, Object accuracy)</li><li>BLEU</li><li>METEOR</li><li>Human evaluation</li></ul><h4 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>实验结果</h4><ol><li>SVO正确率：</li></ol><p>　<img src="https://ws1.sinaimg.cn/mw690/ca26ff18ly1fzd5gnjzg6j20p20f8n0d.jpg" alt="result on SVO"></p><ol><li>BLEU值和METEOR值</li></ol><p>　<img src="https://ws1.sinaimg.cn/mw690/ca26ff18ly1fzd5p0xo78j20nm0aotak.jpg" alt="result on BLEU and METEOR"></p><h3 id="站在2019年回看2015年的论文"><a href="#站在2019年回看2015年的论文" class="headerlink" title="站在2019年回看2015年的论文"></a>站在2019年回看2015年的论文</h3><p>以19年的后见之明来考察这篇论文，虽然论文没有Attention和强化学习加持的，但是也开辟了用神经模型完成视频描述任务的先河。</p><p>回顾一下以前提出的问题，如何才能实现：</p><ol><li>常识推理。</li><li>空间位置。</li><li>根据不同粒度回复问题。</li></ol><p>答案很有可能在我们身上，大脑皮质中的前额皮质掌管着人格（就是你脑中出现的那个声音，就是他）。大脑皮质虽然仅仅是大脑最外层的两毫米厚的薄薄一层（<a href="https://zh.wikipedia.org/wiki/%E5%A4%A7%E8%84%91%E7%9A%AE%E8%B4%A8">没错，我确定就是两毫米</a>）,但是它起到的作用却是史无前例的。</p><p>以大脑皮质作为启发，最少我们也需要让人工大脑皮质也“生存”在一个类似于现实世界中的环境当中。因此视频是一个很好的起点，但也仅仅是个起点。</p><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ul><li><a href="https://waitbutwhy.com/2017/04/neuralink.html">Neuralink and the Brain’s Magical Future</a></li><li><a href="https://www.cs.utexas.edu/~vsub/pdf/Translating_Videos_slides.pdf">Translating Videos to Natural Language Using Deep Recurrent Neural Networks – Slides</a></li><li><a href="https://medium.com/@smallfishbigsea/a-walk-through-of-alexnet-6cbd137a5637">A Walk-through of AlexNet</a></li><li><a href="https://www.cs.utexas.edu/users/ml/clamp/videoDescription/#data">Collecting Multilingual Parallel Video Descriptions Using Mechanical Turk</a></li><li><a href="https://zh.wikipedia.org/wiki/%E5%A4%A7%E8%84%91%E7%9A%AE%E8%B4%A8">大脑皮质</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;论文基本信息&quot;&gt;&lt;a href=&quot;#论文基本信息&quot; class=&quot;headerlink&quot; title=&quot;论文基本信息&quot;&gt;&lt;/a&gt;论文基本信息&lt;/h2&gt;&lt;ol&gt;
&lt;l
      
    
    </summary>
    
      <category term="AGI" scheme="https://824zzy.github.io/categories/AGI/"/>
    
    
      <category term="videoCaptioning" scheme="https://824zzy.github.io/tags/videoCaptioning/"/>
    
  </entry>
  
  <entry>
    <title>Tips for Examination of Network Software Design</title>
    <link href="https://824zzy.github.io/2018/12/26/Network-software-design-exam-tips/"/>
    <id>https://824zzy.github.io/2018/12/26/Network-software-design-exam-tips/</id>
    <published>2018-12-26T15:08:17.000Z</published>
    <updated>2019-01-04T13:21:04.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Question-distribution"><a href="#Question-distribution" class="headerlink" title="Question distribution"></a>Question distribution</h2><ul><li>10 choice questions(20%)</li><li>10 true or false questions(20%)</li><li>6 essay questions(60%)</li></ul><h2 id="Examation-contains-three-parts-Network-amp-Design-amp-Programming"><a href="#Examation-contains-three-parts-Network-amp-Design-amp-Programming" class="headerlink" title="Examation contains three parts: Network &amp; Design &amp; Programming"></a>Examation contains three parts: Network &amp; Design &amp; Programming</h2><h3 id="Network"><a href="#Network" class="headerlink" title="Network"></a>Network</h3><ul><li>IP address:<ul><li><code>public address</code>: an IP address that can be <strong>accessed over the Internet</strong>. And your public IP address is the <strong>globally unique</strong> and <strong>can be found</strong>, and can only be assigned to a unique device.</li><li><code>private IP address</code>: The devices with private IP address will <strong>use your router’s public IP address to communicate</strong>. Note that to allow direct access to a local device which is assign a private IP address, a Network Address Translator(NAT) should be used.</li><li><code>how to compute total ip address in a subnet</code>: <ol><li>transform ip address into binary address.</li><li>count zero from tail to first one.</li><li>subtract 2(reserve address and broadcast address)</li></ol></li></ul></li></ul><hr><ul><li>Port(some default ports for common protocol):<ul><li><strong>http</strong>: 80</li><li><strong>https</strong>: 443</li><li><strong>ntp</strong>: 123</li><li><strong>ssh/tcp</strong>: 22</li><li><strong>mongoDB</strong>: 27017</li><li>DNS: 53</li><li>FTP: 21</li><li>Telnet: 23</li></ul></li></ul><hr><ul><li><p>DNS(duplicated):The Domain Name System(DNS) is the <strong>phonebook</strong> of the Internet. </p><ul><li>DNS <strong>translate domain names to IP address</strong> so browsers can load Internet resources.</li><li>DNS is hierarchical with a few authoritative serves at the top level.<ol><li>Your router or ISP provides information about DNS server to contact when doing a look up.</li><li>Low level DNS servers cache mappings, which could become stale due to DNS propagation delays. </li><li>DNS results can also be cached by your browser or OS for a certain period of time, determined by the time to live(TTL)</li></ol></li></ul></li></ul><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fynty6a1xwj20e50vnjt4.jpg" alt=""></p><hr><ul><li>CDN(duplicated):<ul><li>Definition: <strong>CDN(Content dilivery network/Content distributed network) is a geographically distributed network</strong> of proxy servers and their data centers. The goal is to distribute service spatially relative to end-users to <strong>provide high availability and high performance</strong>.</li><li>Improve performance in two ways:<ul><li>Users receive content at <strong>data centers close to them</strong>.</li><li>Your servers do not have to serve requests that the CND fulfills.</li></ul></li></ul></li></ul><hr><ul><li>Main routing protocol:<ul><li><code>OSPF</code>(Open Shortest Path First): OSPF is an <strong>interior gateway protocal</strong>.</li><li><code>IS-IS</code>(Intermediate System-to-Intermediate System): It is just like OSPF. IS-IS associates routers into areas of intra-area and inter-area.</li><li><code>BGP</code>(Border Gateway Protocol): It is used as the edge of your network. BGP <strong>constructs a routing table of networks</strong> reachable among Autonomous Systems(AS) number defined by the user.</li></ul></li></ul><hr><h3 id="Design"><a href="#Design" class="headerlink" title="Design"></a>Design</h3><h2 id="software-requirements-analysis-not-key-point"><a href="#software-requirements-analysis-not-key-point" class="headerlink" title="- software requirements analysis: not key point"></a>- software requirements analysis: not key point</h2><ul><li>main principles and key technologies for High concurrency programming.(duplicated)<ol><li><code>Security</code>, or correctness, is when a program executes concurrently with the expected results<ol><li>The <strong>visibility</strong></li><li>The <strong>order</strong></li><li>The <strong>atomic</strong></li></ol></li><li><code>Activeness</code>: Program must confront to <strong>deadlocks and livelocks</strong></li><li><code>Performance</code>: <strong>Less context switching, less kernel calls, less consistent traffic</strong>, and so on.</li></ol></li></ul><hr><ul><li>generic phases of software engineering<ol><li><code>Requirements analysis</code></li><li><code>Software Design</code></li><li><code>Implementation</code></li><li><code>Verification/Testing</code></li><li><code>Deployment</code></li><li><code>Maintenance</code></li></ol></li></ul><hr><ul><li>Agile Development(duplicated)<ol><li><code>Agile Values</code>:<ol><li>Individuals and <strong>interactions</strong> over processes and tools</li><li>Working software over <strong>comprehensive documentation</strong></li><li><strong>Customer collaboration</strong> over contract negotiation</li><li><strong>Responding to change</strong> over following a plan</li></ol></li><li><code>Agile Methods</code>:<ol><li>Frequently <strong>deliver small incremental</strong> units of functionality</li><li><strong>Define, build, test and evaluate cycles</strong></li><li>Maximize speed of <strong>feedback loop</strong></li></ol></li></ol></li></ul><hr><h3 id="Programming"><a href="#Programming" class="headerlink" title="Programming"></a>Programming</h3><ul><li>MVC(model-view-controller) model:<ul><li>MVC is an <strong>architectural pattern</strong> commonly used for developing <strong>user interfaces</strong> and allowing for effcient <strong>code reuse</strong> and <strong>paraller development</strong>.<ul><li>Model[probe]: an object carrying data. It can also have logic to <strong>update controller</strong> if its data changes.</li><li>View[frontend]: it can be any <strong>output representation of information</strong>, such as chart or a diagram.</li><li>Controller[backend]: accpet input and converts it to commands for the model or view</li></ul></li></ul></li></ul><hr><ul><li>NoSQL database(duplicated):<ol><li>NoSQL is <code>Not only SQL</code>, it has the advantages below: <ul><li><strong>Not using</strong> the <strong>relational model</strong> nor the SQL language. It is a collection of <strong>data items represented in a key-value store, document-store, wide column store, or a graph database</strong>.</li><li>Designed to run on <strong>large clusters</strong></li><li><strong>No schema</strong></li><li>Open Source</li></ul></li><li>NoSQL properties in detail:<ul><li><strong>Flexible scalability</strong></li><li><strong>Dynamic schema</strong> of data</li><li><strong>Efficient reading</strong></li><li><strong>Cost saving</strong></li></ul></li><li>NoSQL Technologies:<ul><li><strong>MapReduce</strong> programming model</li><li><strong>Key-value</strong> stores</li><li><strong>Document databases</strong></li><li><strong>Column-family stores</strong></li><li><strong>Graph databases</strong> </li></ul></li></ol></li></ul><hr><ul><li>Websocket(duplicated):<ul><li>WebSocket is a <strong>computer communications protocal</strong>, providing <strong>Bidirectional full-duplex communication channels</strong> over a single TCP connection and it is defined in <strong>RFC6445</strong>.</li><li>WebSocket is a different protocol from HTTP. Both protocols are located at <strong>layer 7 in the OSI model</strong> and depend on <strong>TCP at layer 4</strong>.</li><li>The WebSocket protocol enables interaction between a web client and a web server with lower overheads, facilitating real-time data transfer from and to the server.</li><li>working progress<ul><li>There are four main functions in Tornado<ul><li><code>open()</code>: Invoked when a new websocket is opened.</li><li><code>on_message(message)</code>: Handle incoming messages on the WebSocket</li><li><code>on_close()</code>: Invoke when the WebSocket is closed.</li><li><code>write_message(message)</code>: Sends the given message to the client of this Web Socket.</li></ul></li></ul></li></ul></li></ul><hr><ul><li>Differences between <code>git</code> and <code>svn</code>:<ol><li>Git is a <strong>distrubuted</strong> version control system; SVN is a non-distributed version control system.</li><li>Git has a <strong>centralized</strong> server and repository; SVN has <strong>non-centralized</strong> server and repository.</li><li>The content in Git is stored as <strong>metadata</strong>; SVN stores <strong>files of content</strong>.</li><li>Git branches are <strong>easier</strong> to work with than SVN branches.</li><li>Git does not have the <strong>global revision number</strong> feature like SVN has.</li><li>Git has <strong>better content protection</strong> than SVN,</li><li>Git was developed for <strong>Linux kernel</strong> by Linus Torvalds; SVN was deveploped by <strong>CollabNet</strong>.</li></ol></li></ul><h2 id="Essay-Questions"><a href="#Essay-Questions" class="headerlink" title="Essay Questions"></a>Essay Questions</h2><h3 id="Main-role-of-IP-address-port-DNS-CDN-for-network-software-design"><a href="#Main-role-of-IP-address-port-DNS-CDN-for-network-software-design" class="headerlink" title="Main role of IP address, port, DNS, CDN for network software design"></a>Main role of IP address, port, DNS, CDN for network software design</h3><ol><li>An internet Protocal address(IP address) is <strong>a numerical label</strong> assigned to each device connected to a computer network that <strong>uses the Internet Protocal for communication</strong>.</li><li>In computer networking, <strong>a port is an endpoint of communication</strong> and <strong>a logical construct that identifies a specific process</strong> or a type of network device.</li><li><strong>DNS(Domain Name System)</strong> is a <strong>hierarchical decentralized naming system</strong> for computers connected to the Internet or a private network,</li><li><strong>CDN(Content dilivery network/Content distributed network) is a geographically distributed network</strong> of proxy servers and their data centers. The goal is to distribute service spatially relative to end-users to <strong>provide high availability and high performance</strong>.</li></ol><h3 id="Difference-between-git-and-svn"><a href="#Difference-between-git-and-svn" class="headerlink" title="Difference between git and svn:"></a>Difference between <strong>git</strong> and <strong>svn</strong>:</h3><ol><li>Git is a <strong>distrubuted</strong> version control system; SVN is a non-distributed version control system.</li><li>Git has a <strong>centralized</strong> server and repository; SVN has <strong>non-centralized</strong> server and repository.</li><li>The content in Git is stored as <strong>metadata</strong>; SVN stores <strong>files of content</strong>.</li><li>Git branches are <strong>easier</strong> to work with than SVN branches.</li><li>Git does not have the <strong>global revision number</strong> feature like SVN has.</li><li>Git has <strong>better content protection</strong> than SVN,</li><li>Git was developed for <strong>Linux kernel</strong> by Linus Torvalds; SVN was deveploped by <strong>CollabNet</strong>.</li></ol><h3 id="main-principles-and-key-technologies-for-High-concurrency-programming"><a href="#main-principles-and-key-technologies-for-High-concurrency-programming" class="headerlink" title="main principles and key technologies for High concurrency programming"></a>main principles and key technologies for High concurrency programming</h3><ol><li><code>Security</code>, or correctness, is when a program executes concurrently with the expected results<ol><li>The <strong>visibility</strong></li><li>The <strong>order</strong></li><li>The <strong>atomic</strong></li></ol></li><li><code>Activeness</code>: Program must confront to <strong>deadlocks and livelocks</strong></li><li><code>Performance</code>: <strong>Less context switching, less kernel calls, less consistent traffic</strong>, and so on.</li></ol><h3 id="NoSQL-and-SQL-database"><a href="#NoSQL-and-SQL-database" class="headerlink" title="NoSQL and SQL database"></a>NoSQL and SQL database</h3><ol><li>RDBMS(Relational database management system)<ol><li>A relational database like SQL is a collection of <strong>data items organized by tables</strong>. It has features below:<ol><li><code>ACID</code> is a set of <strong>properties of relational database transactions</strong>.</li><li><code>Atomicity</code>: Each transaction is all or nothing</li><li><code>Consistency</code>: Any transaction will bring the database from one valid state to server.</li><li><code>Isolation</code>:  Executing transaction has been committed, it will remain so.</li></ol></li></ol></li><li><p>NoSQL</p><ol><li>NoSQL is <code>Not only SQL</code>, it has the advantages below: <ul><li><strong>Not using</strong> the <strong>relational model</strong> nor the SQL language. It is a collection of <strong>data items represented in a key-value store, document-store, wide column store, or a graph database</strong>.</li><li>Designed to run on <strong>large clusters</strong></li><li><strong>No schema</strong></li><li>Open Source</li></ul></li><li>NoSQL properties in detail:<ul><li><strong>Flexible scalability</strong></li><li><strong>Dynamic schema</strong> of data</li><li><strong>Efficient reading</strong></li><li><strong>Cost saving</strong></li></ul></li><li>NoSQL Technologies:<ul><li><strong>MapReduce</strong> programming model</li><li><strong>Key-value</strong> stores</li><li><strong>Document databases</strong></li><li><strong>Column-family stores</strong></li><li><strong>Graph databases</strong> </li></ul></li></ol></li><li><p>SQL VS NoSQL</p><ol><li>Relational data model VS Document data model</li><li>Structured data VS semi-structured data</li><li>strict schema VS dynamic/flexible schema</li><li>relational data VS Non-relational data</li></ol></li></ol><h3 id="Websocket-working-progress"><a href="#Websocket-working-progress" class="headerlink" title="Websocket(working progress)"></a>Websocket(working progress)</h3><ul><li>WebSocket is a <strong>computer communications protocal</strong>, providing <strong>Bidirectional full-duplex communication channels</strong> over a single TCP connection and it is defined in <strong>RFC6445</strong>.</li><li>WebSocket is a different protocol from HTTP. Both protocols are located at <strong>layer 7 in the OSI model</strong> and depend on <strong>TCP at layer 4</strong>.</li><li>The WebSocket protocol enables interaction between a web client and a web server with lower overheads, facilitating real-time data transfer from and to the server.</li><li>working progress<ul><li>There are four main functions in Tornado<ul><li><code>open()</code>: Invoked when a new websocket is opened.</li><li><code>on_message(message)</code>: Handle incoming messages on the WebSocket</li><li><code>on_close()</code>: Invoke when the WebSocket is closed.</li><li><code>write_message(message)</code>: Sends the given message to the client of this Web Socket.</li></ul></li></ul></li></ul><h3 id="Agile-Development-and-scrum"><a href="#Agile-Development-and-scrum" class="headerlink" title="Agile Development and scrum"></a>Agile Development and scrum</h3><ol><li>Agile Values:<ol><li>Individuals and interactions over processes and tools</li><li>Working software over comprehensive documentation</li><li>Customer collaboration over contract negotiation</li><li>Responding to change over following a plan</li></ol></li><li><p>Agile Methods:</p><ol><li>Frequently deliver small incremental units of functionality</li><li>Define, build, test and evaluate cycles</li><li>Maximize speed of feedback loop</li></ol></li><li><p>Scrum is 3 roles:</p><ol><li>Development Team</li><li>Product Owner</li><li>Scrum Master</li></ol></li><li><p>Scrum is 4 events:</p><ol><li>Sprint Planning</li><li>Daily Stand-up Meeting</li><li>Sprint Review</li><li>Sprint Retrospective</li></ol></li><li><p>Scrum is 4 artifacts:</p><ol><li>Product Backlog</li><li>Sprint Backlog</li><li>User Stories</li><li>Scrum Board</li></ol></li></ol><h3 id="Explain-how-DNS-work"><a href="#Explain-how-DNS-work" class="headerlink" title="Explain how DNS work"></a>Explain how DNS work</h3><p>Definition: The Domain Name System(DNS) is the <strong>phonebook</strong> of the Internet. </p><ul><li>DNS <strong>translate domain names to IP address</strong> so browsers can load Internet resources.</li><li>DNS is hierarchical with a few authoritative serves at the top level.<ol><li>Your router or ISP provides information about DNS server to contact when doing a look up.</li><li>Low level DNS servers cache mappings, which could become stale due to DNS propagation delays. </li><li>DNS results can also be cached by your browser or OS for a certain period of time, determined by the time to live(TTL)</li></ol></li></ul><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fynty6a1xwj20e50vnjt4.jpg" alt=""></p><h3 id="Difference-between-docker-and-virtual-host"><a href="#Difference-between-docker-and-virtual-host" class="headerlink" title="Difference between docker and virtual host"></a>Difference between docker and virtual host</h3><ol><li>Virtual Machine definition: Virtualization is the technique of importing a Guest operating system <strong>on top of a Host operating system</strong>.</li><li>Docker definition: A container image is <strong>a lightweight, stand-alone, executable package of a piece of software</strong> that includes everything needed to run it.</li><li>Docker is the service to run <strong>multiple containers on a machine</strong> (node) which can be on a vitual machine or on a physical machine.</li><li>A virtual machine is an <strong>entire operating system</strong> (which normally is not lightweight).</li></ol><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fymtosr7vyj20vt0f0whj.jpg" alt="Difference between docker and virtual machine"></p><h3 id="Type-of-software-test"><a href="#Type-of-software-test" class="headerlink" title="Type of software test"></a>Type of software test</h3><ul><li><p>Black Box testing: Black box testing is a software testing method where testers <strong>are not required to know coding or internal structure</strong> of the software. Black box testing method relies on testing software with various inputs and validating results against expected output.</p></li><li><p>White Box testing: White box testing strategy deals with the <strong>internal logic and structure of the code</strong>. The tests written based on the white box testing strategy incorporate coverage of the code written, branches, paths, statements and internal logic of the code etc.</p></li><li><p>Equivalence Partitioning:Equivalence Partitioning is also known as Equivalence Class Partitioning is a software testing technique and not a type of testing by itself. Equivalence partitioning technique is <strong>used in black box and gray box testing types</strong>. Equivalence partitioning <strong>classifies test data into Equivalence classes as positive Equivalence classes and negative Equivalence classes</strong>, such classification ensures both positive and negative conditions are tested.</p></li></ul><h3 id="Explain-how-CDN-work"><a href="#Explain-how-CDN-work" class="headerlink" title="Explain how CDN work"></a>Explain how CDN work</h3><ul><li>Definition: <strong>CDN(Content dilivery network/Content distributed network) is a geographically distributed network</strong> of proxy servers and their data centers. The goal is to distribute service spatially relative to end-users to <strong>provide high availability and high performance</strong>.</li><li>Improve performance in two ways:<ul><li>Users receive content at data centers close to them.</li><li>Your servers do not have to serve requests that the CND fulfills</li></ul></li></ul><p>To minimize the distance between the visitors and your website’s server, a CDN stores a cached version of its content in multiple geographical locations (a.k.a., points of presence, or PoPs). Each PoP contains a number of caching servers responsible for content delivery to visitors within its proximity.</p><p>In essence, CDN puts your content in many places at once, providing superior coverage to your users. For example, when someone in London accesses your US-hosted website, it is done through a local UK PoP. This is much quicker than having the visitor’s requests, and your responses, travel the full width of the Atlantic and back.<br><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fymv8bwce7j20d00kqgne.jpg" alt="CDN"></p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference:"></a>Reference:</h2><ul><li><a href="https://www.iplocation.net/public-vs-private-ip-address">What is the difference between public and private IP address?</a></li><li><a href="http://www.differencebetween.net/technology/software-technology/difference-between-git-and-svn/">Difference Between Git and SVN</a></li><li><a href="https://www.tutorialspoint.com/design_pattern/mvc_pattern.htm">Design Patterns - MVC Pattern</a></li><li><a href="https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">Model-view-controller</a></li><li><a href="https://en.wikipedia.org/wiki/IP_address">IP address</a></li><li><a href="https://en.wikipedia.org/wiki/Port_(computer_networking">Port(computer networking)</a>)</li><li><a href="https://en.wikipedia.org/wiki/Domain_Name_System">Domain Name System</a></li><li><a href="https://en.wikipedia.org/wiki/Content_delivery_network">Content delivery network/Content distributed network</a></li><li><a href="https://en.wikipedia.org/wiki/WebSocket">Websocket wiki</a></li><li><a href="https://www.tornadoweb.org/en/stable/websocket.html">tornado.websocket — Bidirectional communication to the browser</a></li><li><a href="https://wenku.baidu.com/view/df90877bf121dd36a22d827d.html">网络常见协议及端口号</a></li><li><a href="https://www.cloudflare.com/learning/dns/what-is-dns/">How does DNS work</a></li><li><a href="http://freewimaxinfo.com/routing-protocol-types.html">Routing Protocols Types (RIP, IGRP, OSPF, EGP, EIGRP, BGP, IS-IS)</a></li><li><a href="https://www.virtually-limitless.com/vcix-nv-study-guide/configure-dynamic-routing-protocols-ospf-bgp-is-is/">Configure dynamic routing protocols: OSPF, BGP, IS-IS</a></li><li><a href="https://stackoverflow.com/questions/48396690/docker-vs-virtual-machine">Docker vs Virtual Machine</a></li><li><a href="https://www.geeksforgeeks.org/types-software-testing/">Types of Software Testing</a></li><li><a href="https://www.testingexcellence.com/white-box-testing/">white-box-testing</a></li><li><a href="https://www.testingexcellence.com/types-of-software-testing-complete-list/">types-of-software-testing-complete-list</a></li><li><a href="https://blog.csdn.net/ITer_ZC/article/details/40748587">聊聊高并发专栏</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Question-distribution&quot;&gt;&lt;a href=&quot;#Question-distribution&quot; class=&quot;headerlink&quot; title=&quot;
      
    
    </summary>
    
      <category term="BUPT" scheme="https://824zzy.github.io/categories/BUPT/"/>
    
    
      <category term="exam" scheme="https://824zzy.github.io/tags/exam/"/>
    
  </entry>
  
  <entry>
    <title>信息论-信道容量迭代算法python实现版</title>
    <link href="https://824zzy.github.io/2018/12/26/information-theory-channel-capacity-iteration-algorithm/"/>
    <id>https://824zzy.github.io/2018/12/26/information-theory-channel-capacity-iteration-algorithm/</id>
    <published>2018-12-26T15:08:17.000Z</published>
    <updated>2018-12-27T08:24:04.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Talk-is-Cheap-Let-me-show-you-the-code"><a href="#Talk-is-Cheap-Let-me-show-you-the-code" class="headerlink" title="Talk is Cheap, Let me show you the code"></a>Talk is Cheap, Let me show you the code</h2><p><a href="https://colab.research.google.com/drive/1oFkI8WYPzQhvC1FLV7Fw8NR70EqNtENc">传送门</a></p>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Talk-is-Cheap-Let-me-show-you-the-code&quot;&gt;&lt;a href=&quot;#Talk-is-Cheap-Let-me-show-you-th
      
    
    </summary>
    
      <category term="BUPT" scheme="https://824zzy.github.io/categories/BUPT/"/>
    
    
      <category term="exam" scheme="https://824zzy.github.io/tags/exam/"/>
    
      <category term="infomationTheory" scheme="https://824zzy.github.io/tags/infomationTheory/"/>
    
  </entry>
  
  <entry>
    <title>Neuralink and Brains&#39;Magical Future</title>
    <link href="https://824zzy.github.io/2018/12/25/Neuralink-and-Brains-Magical-Future/"/>
    <id>https://824zzy.github.io/2018/12/25/Neuralink-and-Brains-Magical-Future/</id>
    <published>2018-12-25T20:54:22.000Z</published>
    <updated>2019-01-14T16:19:22.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h3 id="Part1-The-Human-Colossus"><a href="#Part1-The-Human-Colossus" class="headerlink" title="Part1: The Human Colossus"></a>Part1: The Human Colossus</h3><p>In this part, the brief history of humanbeing is displayed. The process of evolution:</p><ol><li>Sponge(600 Million BC): Data is just like saved in <code>Cache</code>~</li><li>Jellyfish(580 Million BC): The first animal has <code>nerves net</code> to save data from environment. Note that nerves net not only exist in its head but also in the whole body.</li><li>Flatworm(550 Million BC) and Frog(265 Million BC): The flatworm has nervous system in charge of everything.</li><li>Rodent(225 Million BC) and Tree mammal(80 Million BC): More complex animals.</li><li>Hominid(4 Million BC): The early version of neocortex. Hominid could think(complex thoughts, reason through decisions, long-term plans). When language had appeared, knowledges are saved in an intricate system(neural net). Homimid already has enough knowledge from their ancestors.</li><li>Computer Colossus(1990s): Computer network that can not learning to think.</li></ol><h3 id="Part2-The-Brain"><a href="#Part2-The-Brain" class="headerlink" title="Part2: The Brain"></a>Part2: The Brain</h3><p>Three membranes around brain: dura mater, arachnoid mater, pia mater.<br>Looking into brain, there are three parts: neomammalian, paleomammalian, reptilian.</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fz4u66o1a6j20ob0ll4qp.jpg" alt=""></p><ol><li>The Reptilian Brain(爬行脑): the brain stem<ol><li>The medulla[mi’dula] oblongata[abon’gata] (延髓): control involuntary things like heart rate, breathing, and blood pressure.</li><li>The pons(脑桥): generate actions about the little things like bladder control, facial expressions.</li><li>The mid brain(中脑): eyes moving.</li><li>The cerebellum(小脑): Stay balanced.</li></ol></li></ol><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fz4ukscpc0j20j50h9wst.jpg" alt=""></p><ol><li>The Paleo-Mammalian Brain(古哺乳脑): the limbic system(边缘脑)<ol><li>The amygdala(杏仁核): deal with anxiety, fear, happy feeling.</li><li>The hippocampus(海马体): a board for memory to direction.</li><li>The thalamus(丘脑): sensory middleman that receives information from your sensory organ and sends them to your cortex for processing.</li></ol></li></ol><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fz502o3cl3j20m80fntl1.jpg" alt=""></p><ol><li>The Neo-Mammalian Brain(新哺乳脑): The Cortex(皮质)<ol><li>The frontal lobe(前叶): Handle with reasoning, planning, executive function. And <strong>the adult in your head</strong> call <strong>prefrontal cortex</strong>(前额皮质).</li><li>The parietal lobe(顶叶): Controls sense of touch.</li><li>The temporal lobe(额叶): where your memory lives</li><li>The occipital lobe(枕叶): entirely dedicated to vision.</li></ol></li></ol><p>Inspiration from neural nets:<br><code>Neuroplasticity</code>: Neurons’ ability to alter themselves chemically, structurally, and even functionally, allow your brain’s neural network to optimize itself to the external world. Neuroplasticity makes sure that human can grow and change and learn new things throughout their whole lives.</p><h3 id="Reference"><a href="#Reference" class="headerlink" title="Reference:"></a>Reference:</h3><ul><li><a href="https://waitbutwhy.com/2017/04/neuralink.html">neuralink from wait but why</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h3 id=&quot;Part1-The-Human-Colossus&quot;&gt;&lt;a href=&quot;#Part1-The-Human-Colossus&quot; class=&quot;headerlink&quot; t
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="agi" scheme="https://824zzy.github.io/tags/agi/"/>
    
  </entry>
  
  <entry>
    <title>意识先验</title>
    <link href="https://824zzy.github.io/2018/12/09/ConsciousnessPrior-Bengio/"/>
    <id>https://824zzy.github.io/2018/12/09/ConsciousnessPrior-Bengio/</id>
    <published>2018-12-10T00:53:33.000Z</published>
    <updated>2018-12-09T11:27:34.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h1 id="意识先验理论"><a href="#意识先验理论" class="headerlink" title="意识先验理论"></a>意识先验理论</h1><h2 id="如何理解意识先验"><a href="#如何理解意识先验" class="headerlink" title="如何理解意识先验"></a>如何理解意识先验</h2><p>首先，意识先验这篇论文没有实验结果，是一篇纯粹的开脑洞的、理论性的文章。</p><p>论文中提到的意识先验更多的是对<strong>不同层次</strong>的信息的<strong>表征</strong>提取。例如：人类创造了高层次的概念，如符号（自然语言）来简化我们的思维。</p><p>2007 年，Bengio 与 Yann LeCun 合著的论文着重强调表征必须是多层的、逐渐抽象的。13年，Bengio 在综述论文中，增加了对解纠缠（Disentangling）的强调。</p><h3 id="RNN是个很好的例子"><a href="#RNN是个很好的例子" class="headerlink" title="RNN是个很好的例子"></a>RNN是个很好的例子</h3><p>RNN的隐藏状态包含一个低维度的子状态，可以用来解释过去，帮助预测未来，也可以作为自然语言来呈现。</p><p><img src="https://github.com/824zzy/blogResources/blob/master/picResources/ConsciousnessPrior.png?raw=true" alt="意识先验网络示意图"></p><h2 id="表征RNN（Representation-RNN-F）"><a href="#表征RNN（Representation-RNN-F）" class="headerlink" title="表征RNN（Representation RNN / F）"></a>表征RNN（Representation RNN / F）</h2><p>$$h_t = F(s_t,h_t−1)$$</p><p>Bengio提出表征RNN($F$)和表征状态$h_t$。其中$F$是包含了大脑中所有的神经连接权重。它们可以看作是我们的知识和经验，将一种表示状态映射到另一种表示状态。</p><p>表征RNN与一个人在不同环境学习到的知识、学识和经验相对应。即使有相同的$F$, 人们的反应和未来的想法也会不尽相同。表征状态$h_t$对应大脑所有神经元状态的聚合。并且他们可以被看作是当时环境（最底层信息）的表征。</p><h2 id="意识RNN-Consciousness-RNN-C"><a href="#意识RNN-Consciousness-RNN-C" class="headerlink" title="意识RNN (Consciousness RNN / C)"></a>意识RNN (Consciousness RNN / C)</h2><p>$$c_t=C(h<em>t,c</em>{t-1},z_t)$$</p><p>没有人能够有意识地体会到大脑里所有神经元是如何运作的。因为只有一小部分神经元与大脑此时正在思考的想法和概念相对应。因此意识是大脑神经元一个小的子集，或者说是副产品（by-product）。</p><p>因此Bengio认为，意识RNN本身应该包含某种注意力机制（当前在神经机器翻译中使用的)。他引入注意力作为额外的机制来描述大脑选择关注什么，以及如何预测或行动。</p><p>简而言之，意识RNN应该只“注意”意识向量更新自身时的重要细节，以<strong>减少计算量</strong>。</p><h2 id="验证网络（Verifier-Network-V）"><a href="#验证网络（Verifier-Network-V）" class="headerlink" title="验证网络（Verifier Network / V）"></a>验证网络（Verifier Network / V）</h2><p>$$V(h<em>t,c</em>{t-k})\in R$$</p><p>Bengio的思想还包含了一种训练方法，他称之为验证网络$V$。网络的目标是将当前的$h<em>t$表示与之前的意识状态$c</em>{t-k}$相匹配。在他的设想中可以用变分自动编码器(VAE)或GAN进行训练。</p><h2 id="语言与符号主义的联结"><a href="#语言与符号主义的联结" class="headerlink" title="语言与符号主义的联结"></a>语言与符号主义的联结</h2><p>深度学习的主要目标之一就是设计出能够习得更好表征的算法。好的表征理应是高度抽象的、高维且稀疏的，但同时，也能和自然语言以及符号主义 AI 中的『高层次要素』联系在一起。</p><p>语言和符号人工智能的联系在于：语言是一种“选择性的过程”，语言中的语句可以忽略世界上的大部分细节，而专注于少数。符号人工智能只需要了解世界的一个特定方面，而不是拥有一切的模型。</p><p>Bengio关于如何使这一点具体化的想法是：先有一个“意识”，它迫使一个模型拥有不同类型的“意识流”，这些“意识流”可以独立运作，捕捉世界的不同方面。例如，如果我在想象与某人交谈，我对那个人、他们的行为以及我与他们的互动有一种意识，但我不会在那一刻对我的视觉流中的所有像素进行建模。</p><h3 id="思考：快与慢"><a href="#思考：快与慢" class="headerlink" title="思考：快与慢"></a>思考：快与慢</h3><p>人类的认知任务可以分为系统 1 认知（System 1 cognition）和系统 2 认知（System 2 cognition）。系统 1 认知任务是那些你可以在不到 1 秒时间内无意识完成的任务。例如你可以很快认出手上拿着的物体是一个瓶子，但是无法向其他人解释如何完成这项任务。这也是当前深度学习擅长的事情，「感知」。<br>系统 2 认知任务与系统 1 任务的方式完全相反，它们很「慢」且有意识。例如计算「23*56」，大多数人需要有意识地遵循一定的规则、按照步骤完成计算。完成的方法可以用语言解释，而另一个人可以理解并重现。这是算法，是计算机科学的本意，符号主义 AI 的目标，也属于此类。<br>人类联合完成系统 1 与系统 2 任务，人工智能也理应这样。</p><h2 id="还有很多问题需要解决"><a href="#还有很多问题需要解决" class="headerlink" title="还有很多问题需要解决"></a>还有很多问题需要解决</h2><h3 id="训练的目标函数是什么？"><a href="#训练的目标函数是什么？" class="headerlink" title="训练的目标函数是什么？"></a>训练的目标函数是什么？</h3><p>标准的深度学习算法的目标函数通常基于最大似然，但是我们很难指望最大似然的信号能够一路经由反向传播穿过用于预测的网络，穿过意识RNN，最终到达表征 RNN。</p><p>最大似然与意识先验的思想天然存在冲突。「人类从不在像素空间进行想象与生成任务，人类只在高度抽象的语义空间使用想象力，生成一张像素级的图像并非人类需要完成的任务。」因此，在训练目标里引入基于表征空间的项目就变得顺理成章。<strong>不在原始数据空间内定义目标函数</strong></p><h3 id="梯度下降是否适用于意识先验？"><a href="#梯度下降是否适用于意识先验？" class="headerlink" title="梯度下降是否适用于意识先验？"></a>梯度下降是否适用于意识先验？</h3><blockquote><p>Jaderberg, M., Czarnecki, W. M., Osindero, S., Vinyals, O., Graves, A., Silver, D., &amp; Kavukcuoglu, K. (2016). Decoupled neural interfaces using synthetic gradients. arXiv preprint arXiv:1608.05343.</p></blockquote><p>除了目标函数之外，意识先验的优化方式也会和经典深度学习有所不同。Bengio： 什么样的优化方式最适合意识先验？我仍然不知道这个问题的答案。<br>在他看来，一类很有前景的研究是合成梯度（synthetic gradient）。</p><p>有了合成梯度之后，每一层的梯度可以单独更新了。但是当时间步继续拉长，问题仍然存在。理论上反向传播可以处理相当长的序列，但是鉴于人类处理时间的方式并非反向传播，可以轻松跨越任意时长，等「理论上」遇到一千乃至一万步的情况，实际上就不奏效了。</p><h3 id="信用分配的仍然是最大的问题"><a href="#信用分配的仍然是最大的问题" class="headerlink" title="信用分配的仍然是最大的问题"></a>信用分配的仍然是最大的问题</h3><blockquote><p>Ke, N. R., Goyal, A., Bilaniuk, O., Binas, J., Mozer, M. C., Pal, C., &amp; Bengio, Y. (2018). Sparse Attentive Backtracking: Temporal CreditAssignment Through Reminding. arXiv preprint arXiv:1809.03702.</p></blockquote><p>换言之，我们对时间的信用分配（credit assignment）问题的理解仍然有待提高。「比如你在开车的时候听到『卟』的一声，但是你没在意。三个小时之后你停下车，看到有一个轮胎漏气了，立刻，你的脑海里就会把瘪轮胎和三小时前的『卟』声联系起来——不需要逐个时间步回忆，直接跳到过去的某个时间，当场进行信用分配。」。受人脑的信用分配方式启发，Bengio 的团队尝试了一种稀疏注意回溯（Sparse Attentive Backtracking）方法。「我们有一篇关于时间信用分配的工作，是 NIPS 2018 的论文，能够跳过成千上万个时间步，利用对记忆的访问直接回到过去——就像人脑在获得一个提醒时所作的那样——直接对一件事进行信用分配。」</p><h2 id="关于意识先验的代码"><a href="#关于意识先验的代码" class="headerlink" title="关于意识先验的代码"></a>关于意识先验的代码</h2><ul><li>论文：<a href="https://ai-on.org/pdf/bengio-consciousness-prior.pdf">Experiments on the Consciousness Prior</a></li><li>代码：<a href="https://github.com/AI-ON/TheConsciousnessPrior/tree/master/src">TheConsciousnessPrior github</a></li></ul><h2 id="参考与引用"><a href="#参考与引用" class="headerlink" title="参考与引用"></a>参考与引用</h2><ul><li><a href="http://thegrandjanitor.com/2018/05/09/a-read-on-the-consciousness-prior-by-prof-yoshua-bengio/">A READ ON “THE CONSCIOUSNESS PRIOR” BY PROF. YOSHUA BENGIO</a></li><li><a href="https://www.quora.com/What-is-Yoshua-Bengios-new-Consciousness-Prior-paper-about">What is Yoshua Bengio’s new “Consciousness Prior” paper about?</a></li><li><a href="https://www.reddit.com/r/MachineLearning/comments/72h5zf/r_the_consciousness_prior/">reddit</a></li><li><a href="https://www.jiqizhixin.com/articles/2018-11-29-7">Yoshua Bengio访谈笔记：用意识先验糅合符号主义与联结主义</a></li></ul>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h1 id=&quot;意识先验理论&quot;&gt;&lt;a href=&quot;#意识先验理论&quot; class=&quot;headerlink&quot; title=&quot;意识先验理论&quot;&gt;&lt;/a&gt;意识先验理论&lt;/h1&gt;&lt;h2 id=
      
    
    </summary>
    
      <category term="AI" scheme="https://824zzy.github.io/categories/AI/"/>
    
    
      <category term="agi" scheme="https://824zzy.github.io/tags/agi/"/>
    
      <category term="bengio" scheme="https://824zzy.github.io/tags/bengio/"/>
    
  </entry>
  
  <entry>
    <title>最大概率汉语切分作业</title>
    <link href="https://824zzy.github.io/2018/12/06/Chinese-segmentation-Homework/"/>
    <id>https://824zzy.github.io/2018/12/06/Chinese-segmentation-Homework/</id>
    <published>2018-12-06T17:16:41.000Z</published>
    <updated>2018-12-13T06:12:30.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Talk-is-Cheap-Let-me-show-you-the-code"><a href="#Talk-is-Cheap-Let-me-show-you-the-code" class="headerlink" title="Talk is Cheap, Let me show you the code"></a>Talk is Cheap, Let me show you the code</h2><p><a href="https://colab.research.google.com/drive/1ns3HetlP-8Np6GdF-mjaa0zIaB8k9lDG#scrollTo=XX1EqIqD3Y9q">传送门</a></p>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Talk-is-Cheap-Let-me-show-you-the-code&quot;&gt;&lt;a href=&quot;#Talk-is-Cheap-Let-me-show-you-th
      
    
    </summary>
    
      <category term="NLP" scheme="https://824zzy.github.io/categories/NLP/"/>
    
    
      <category term="chineseSegmentation" scheme="https://824zzy.github.io/tags/chineseSegmentation/"/>
    
      <category term="bupt" scheme="https://824zzy.github.io/tags/bupt/"/>
    
  </entry>
  
  <entry>
    <title>数据挖掘文本分类作业</title>
    <link href="https://824zzy.github.io/2018/12/06/DataMining-Homework/"/>
    <id>https://824zzy.github.io/2018/12/06/DataMining-Homework/</id>
    <published>2018-12-06T17:16:41.000Z</published>
    <updated>2018-12-14T08:26:31.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Talk-is-Cheap-Let-me-show-you-the-code"><a href="#Talk-is-Cheap-Let-me-show-you-the-code" class="headerlink" title="Talk is Cheap, Let me show you the code"></a>Talk is Cheap, Let me show you the code</h2><p><a href="https://colab.research.google.com/drive/1AIzOZinBCn7iHo8Dx6AgLMRBzy2WglA-#scrollTo=ZPYreCTDWQB_&amp;uniqifier=4">传送门</a></p>]]></content>
    
    <summary type="html">
    
      
      
        &lt;script src=&quot;/assets/js/APlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;Talk-is-Cheap-Let-me-show-you-the-code&quot;&gt;&lt;a href=&quot;#Talk-is-Cheap-Let-me-show-you-th
      
    
    </summary>
    
      <category term="ML" scheme="https://824zzy.github.io/categories/ML/"/>
    
    
      <category term="bupt" scheme="https://824zzy.github.io/tags/bupt/"/>
    
      <category term="dataMining" scheme="https://824zzy.github.io/tags/dataMining/"/>
    
  </entry>
  
</feed>