<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>projects on Roy Tang</title>
      <link>https://mirror.roytang.net/tags/projects/</link>
      <description>Recent content in projects on Roy Tang</description>
      <generator>Hugo -- gohugo.io</generator>
      <language>en-us</language>
      <managingEditor>hello@roytang.net (Roy Tang)</managingEditor>
      <webMaster>hello@roytang.net (Roy Tang)</webMaster>
      <lastBuildDate>Sun, 03 Feb 2019 05:56:56 +0000</lastBuildDate>
      
          <atom:link href="https://mirror.roytang.net/tags/projects/index.xml" rel="self" type="application/rss+xml" />
      
          
      
        <item>
            <title>TriviaStorm: Text and Answer parsing</title>
            <link>https://mirror.roytang.net/2019/02/triviastorm-text-and-answer-parsing/</link>
            <pubDate>Sun, 03 Feb 2019 05:56:56 +0000</pubDate>
            <author>hello@roytang.net (Roy Tang)</author>
            <guid>https://mirror.roytang.net/2019/02/triviastorm-text-and-answer-parsing/</guid>
            <description>
            
            &lt;p&gt;A while back I started a &lt;a href=&#34;https://mirror.roytang.net/2017/02/weekend-project-twitter-trivia-bot/&#34;&gt;Twitter trivia bot as a weekend project&lt;/a&gt;. That bot is still &lt;a href=&#34;https://twitter.com/triviastorm&#34;&gt;up and running on Twitter&lt;/a&gt;, you can check it out there!&lt;/p&gt;
&lt;p&gt;But today, I thought I&amp;rsquo;d write about the answer-checking mechanism used by the bot. It was a bit interesting to me because it was the first nontrivial use I had for &lt;a href=&#34;https://docs.djangoproject.com/en/dev/internals/contributing/writing-code/unit-tests/&#34;&gt;Django&amp;rsquo;s unit testing framework&lt;/a&gt;. I&amp;rsquo;m not too keen on unit testing web functionality (something I still have to learn), but this seemed an appropriate first use of a unit test framework for several reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the bot had to be able to handle a wide variety of answers&lt;/li&gt;
&lt;li&gt;there were a lot of test cases to check and a single checking function handling everything - I couldn&amp;rsquo;t risk breaking previous working tests&lt;/li&gt;
&lt;li&gt;inputs were discrete and outputs were easily checkable&lt;/li&gt;
&lt;li&gt;I needed to be able to add new test scenarios all the time as more problematic answers were provided&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The project currently isn&amp;rsquo;t open source, but I did make a gist of the &lt;code&gt;tests.py&lt;/code&gt; I used &lt;a href=&#34;https://gist.github.com/roytang/9199962097bdf3ca2aa8ec9c43bd7ef8&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;check&lt;/code&gt; function basically accepts three parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a checking mode (currently only supports EXACT and ALL_ANYORDER)&lt;/li&gt;
&lt;li&gt;the answer phrase provided by the player&lt;/li&gt;
&lt;li&gt;a set of valid answers accepted for the question&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are a number of test cases already handled:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;checking should be case-insensitive&lt;/li&gt;
&lt;li&gt;articles should be ignored if they&amp;rsquo;re at the start of the answer phrase&lt;/li&gt;
&lt;li&gt;numbers should be acceptable for the spelled-out versions, e.g. &amp;ldquo;7&amp;rdquo; should be accepted for &amp;ldquo;seven&amp;rdquo;, and vice versa&lt;/li&gt;
&lt;li&gt;some minor soundex (phonetic matching) support (via &lt;a href=&#34;https://github.com/jamesturk/jellyfish&#34;&gt;Python Jellyfish&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;handling of questions that support multiple answers. This is what ALL_ANYORDER is for - it means that all the given answers must be provided, but they can be in any order. For example, if the valid answer set is &lt;code&gt;&amp;quot;Huey&amp;quot;&lt;/code&gt;, &lt;code&gt;&amp;quot;Dewey&amp;quot;&lt;/code&gt; and &lt;code&gt;&amp;quot;Louie&amp;quot;&lt;/code&gt;, then &lt;code&gt;&amp;quot;Louie Huey Dewey&amp;quot;&lt;/code&gt; should be accepted as an answer&lt;/li&gt;
&lt;li&gt;nonalphanumeric characters should be treated as whitespace, except in some special cases&lt;/li&gt;
&lt;li&gt;special case: contractions like &lt;code&gt;&amp;quot;don&#39;t&amp;quot;&lt;/code&gt; or &lt;code&gt;&amp;quot;can&#39;t&amp;quot;&lt;/code&gt; should be treated as if they were a single term like &lt;code&gt;&amp;quot;dont&amp;quot;&lt;/code&gt; or &lt;code&gt;&amp;quot;cant&amp;quot;&lt;/code&gt; instead of &lt;code&gt;&amp;quot;don t&amp;quot;&lt;/code&gt; or &lt;code&gt;&amp;quot;can t&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
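&lt;p&gt;To make the rules above concrete, here is a minimal sketch of what a checker along these lines could look like. It is not the bot&amp;rsquo;s actual code: the names &lt;code&gt;normalize&lt;/code&gt;, &lt;code&gt;ARTICLES&lt;/code&gt; and &lt;code&gt;NUMBER_WORDS&lt;/code&gt; are made up for illustration, the number table only goes up to ten, the tokenizer is ASCII-only, and the phonetic matching is left out so the example needs nothing beyond the standard library.&lt;/p&gt;

```python
import re

EXACT = "EXACT"
ALL_ANYORDER = "ALL_ANYORDER"

# Hypothetical lookup tables, kept small for illustration.
ARTICLES = {"a", "an", "the"}
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}


def normalize(phrase):
    """Reduce a phrase to a list of comparable lowercase tokens."""
    text = phrase.lower()
    # Special case: drop apostrophes so "don't" becomes "dont", not "don t".
    text = text.replace("'", "").replace("\u2019", "")
    # Every other non-alphanumeric character is treated as whitespace.
    tokens = re.sub(r"[^a-z0-9]+", " ", text).split()
    # Ignore an article only when it starts a phrase that has more words.
    if tokens[1:] and tokens[0] in ARTICLES:
        tokens = tokens[1:]
    # Map spelled-out numbers to digits so "seven" and "7" compare equal.
    return [NUMBER_WORDS.get(token, token) for token in tokens]


def check(mode, answer, valid_answers):
    """Return True if the answer is acceptable under the given mode."""
    given = normalize(answer)
    if mode == EXACT:
        # Any single valid answer may match.
        return any(given == normalize(valid) for valid in valid_answers)
    if mode == ALL_ANYORDER:
        # Every valid answer must appear, in any order. This simplification
        # compares sorted tokens, so multi-word answers lose their grouping.
        expected = sorted(
            token for valid in valid_answers for token in normalize(valid)
        )
        return sorted(given) == expected
    raise ValueError("unsupported mode: %s" % mode)
```

&lt;p&gt;With this sketch, &lt;code&gt;check(ALL_ANYORDER, &amp;quot;Louie Huey Dewey&amp;quot;, [&amp;quot;Huey&amp;quot;, &amp;quot;Dewey&amp;quot;, &amp;quot;Louie&amp;quot;])&lt;/code&gt; is accepted, while the same call without &lt;code&gt;&amp;quot;Louie&amp;quot;&lt;/code&gt; in the answer is rejected.&lt;/p&gt;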
&lt;p&gt;The answer checking definitely still isn&amp;rsquo;t perfect, but I&amp;rsquo;m pretty happy with where it&amp;rsquo;s at right now. There is also definitely an element of subjectivity as to which answers should be accepted. One time a player complained that his answer &lt;code&gt;&amp;quot;Batman vs Superman Dawn of Justice&amp;quot;&lt;/code&gt; should count for &lt;code&gt;&amp;quot;Batman v Superman: Dawn of Justice&amp;quot;&lt;/code&gt;, but for this particular question I had chosen not to allow &amp;ldquo;vs&amp;rdquo; for &amp;ldquo;v&amp;rdquo; because that was the actual movie title, which might be unreasonable now that I think about it!&lt;/p&gt;
&lt;p&gt;I do know that I need to implement better &amp;ldquo;synonym&amp;rdquo; handling, e.g. mapping of &lt;code&gt;&amp;quot;v&amp;quot;&amp;lt;-&amp;gt;&amp;quot;vs&amp;quot;&lt;/code&gt; and other terms like &lt;code&gt;&amp;quot;mr&amp;quot;&amp;lt;-&amp;gt;&amp;quot;mister&amp;quot;&lt;/code&gt; or &lt;code&gt;&amp;quot;natl&amp;quot;&amp;lt;-&amp;gt;&amp;quot;national&amp;quot;&lt;/code&gt;. The problem with handling things like that is that the number of possible phrase combinations expands when multiple such terms are found in the same answer, so it can&amp;rsquo;t scale too well. I suppose I need to normalize the answer sets at the time the question is defined. What do you know, I figured out how to do something just by writing a blog post!&lt;/p&gt;
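&lt;p&gt;A rough sketch of that normalization idea, assuming each synonym is collapsed to a single canonical token both when the question is defined and when an answer comes in. The function names and the &lt;code&gt;SYNONYMS&lt;/code&gt; table are hypothetical, not part of the bot:&lt;/p&gt;

```python
import re

# Illustrative synonym table: each variant maps to one canonical token.
SYNONYMS = {"vs": "v", "mister": "mr", "national": "natl"}


def canonical(phrase):
    """Lowercase, tokenize, and replace every synonym with its canonical form."""
    tokens = re.findall(r"[a-z0-9]+", phrase.lower())
    return tuple(SYNONYMS.get(token, token) for token in tokens)


def define_question(valid_answers):
    """Normalize the answer set once, when the question is defined."""
    return {canonical(answer) for answer in valid_answers}


def is_accepted(answer, stored_answers):
    """Normalize the player's answer the same way and look it up."""
    return canonical(answer) in stored_answers


stored = define_question(["Batman v Superman: Dawn of Justice"])
print(is_accepted("Batman vs Superman Dawn of Justice", stored))  # prints True
```

&lt;p&gt;Because both sides are reduced to the same canonical tokens, there is no need to generate every combination of variants; an answer with several such terms still costs one normalization and one lookup.&lt;/p&gt;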
&lt;p&gt;I do have a bunch of other enhancements planned for the trivia bot, including support for Slack and Discord, and a longer time frame roadmap, but I&amp;rsquo;m not sure when I can commit more time to it. Still, it&amp;rsquo;s turned out to be a pretty fun endeavor, and I&amp;rsquo;m hoping it leads to something cool!&lt;/p&gt;



            </description>
        </item>
    
        <item>
            <title>Django Blog Application</title>
            <link>https://mirror.roytang.net/2018/10/django-blog-application/</link>
            <pubDate>Sun, 28 Oct 2018 05:02:37 +0000</pubDate>
            <author>hello@roytang.net (Roy Tang)</author>
            <guid>https://mirror.roytang.net/2018/10/django-blog-application/</guid>
            <description>
            
            &lt;p&gt;Ten years ago this month, I started studying &lt;a href=&#34;https://www.djangoproject.com/&#34;&gt;Django&lt;/a&gt; by trying to build my own blog application. I found the code lying around while I was going through some backups lately. It&amp;rsquo;s way out of date, since it uses an early version of Django. I thought of bringing it up to speed, but that didn&amp;rsquo;t seem practical. Instead, for archival purposes, I cleaned it up a bit and &lt;a href=&#34;https://github.com/roytang/django-blog&#34;&gt;uploaded the code to a GitHub repo&lt;/a&gt;. (Helpful GitHub immediately warned me that having a very old version of Django was a security risk, lol.) There&amp;rsquo;s a lot more information in the README.md of that repo. I actually used this as my main blog engine for a while before I decided the maintenance effort wasn&amp;rsquo;t worth it and switched to WordPress.&lt;/p&gt;
&lt;p&gt;Tangent: I&amp;rsquo;m not actually 100% happy with WordPress, and when I found this old code I was tempted to maybe try building Yet Another Blog Application (TM), except maybe using the opportunity to learn some other framework. My 2018 personal to-do list is already way too long though.&lt;/p&gt;



            </description>
        </item>
    
        <item>
            <title>Weekend Project: Twitter Trivia Bot</title>
            <link>https://mirror.roytang.net/2017/02/weekend-project-twitter-trivia-bot/</link>
            <pubDate>Thu, 23 Feb 2017 01:30:00 +0000</pubDate>
            <author>hello@roytang.net (Roy Tang)</author>
            <guid>https://mirror.roytang.net/2017/02/weekend-project-twitter-trivia-bot/</guid>
            <description>
            
            &lt;p&gt;I had been meaning to try writing a Twitter bot for a while now. I figured a trivia bot would be pretty easy to implement, so I spent some time over a couple of weekends to rig one together.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s (mostly) working now, the bot is active as &lt;a href=&#34;https://twitter.com/triviastorm/&#34;&gt;triviastorm on Twitter&lt;/a&gt;, with a supporting webapp deployed on &lt;a href=&#34;https://triviastorm.net/&#34;&gt;https://triviastorm.net/&lt;/a&gt;. The bot tweets out a trivia question once every hour. It will then award points to the first five people who gave the correct answer. The bot will only recognize answers given as a direct reply to the tweet with the question, and only those submitted within the one hour period.&lt;/p&gt;
&lt;p&gt;Some technical details:&lt;/p&gt;
&lt;p&gt;My scripting language of choice for the past few years has been Python 2.7. I&amp;rsquo;m using &lt;a href=&#34;http://www.tweepy.org/&#34;&gt;Tweepy&lt;/a&gt; to interact with the Twitter API, &lt;a href=&#34;https://github.com/PyMySQL/PyMySQL&#34;&gt;PyMySQL&lt;/a&gt; to connect to the database, and &lt;a href=&#34;http://flask.pocoo.org/&#34;&gt;Flask&lt;/a&gt; to run the webapp. I haven&amp;rsquo;t used Flask in some time, but it&amp;rsquo;s still very straightforward. I actually had a harder time configuring the webapp with mod_wsgi on my host.&lt;/p&gt;
&lt;p&gt;The main problem with a trivia system is that you need a large and high-quality set of questions. Right now the bot is using a small trivia set &amp;ndash; around a thousand questions I got from a variety of sources. If I want to leave this bot running for a while, I&amp;rsquo;m going to need a much larger trivia set. However, reviewing and collating the questions is a nontrivial task. Hopefully I can add new questions every so often.&lt;/p&gt;
&lt;p&gt;Feel free to follow the bot and help test it out. I&amp;rsquo;d be grateful!&lt;/p&gt;



            </description>
        </item>
    
    </channel>
  </rss>