Expectation Maximization is a very generic inference scheme that comes in many variations depending on the structure of the model it is applied to. For instance, sklearn already ships Gaussian Mixture Models that can be fitted using EM. There is also an implementation of k-means, which can itself be interpreted as an EM algorithm.
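As a minimal sketch of the GMM case (using sklearn's `GaussianMixture`, which runs EM under the hood; the data here is made up for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two well-separated 1-D clusters as toy data.
X = np.concatenate([rng.normal(-5, 1, 100),
                    rng.normal(5, 1, 100)]).reshape(-1, 1)

# fit() runs EM: E-step computes soft responsibilities,
# M-step re-estimates the means, covariances, and weights.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)
print(sorted(gmm.means_.ravel()))  # means recovered near -5 and 5
```

k-means falls out of the same scheme if you make the responsibilities hard (each point assigned fully to its nearest center) and fix the covariances to be identical and spherical.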
Congrats on the project! It's really interesting to see electrical engineering tools using the web and modern concepts instead of the usual nasty Windows-only IDE packaging.
This is great to hear! I used a similar construction in my final project (electrical engineering) but left the appropriate mathematical proof (if any) to the Future Work section. The classification rule I found to work better was:

c = argmax over C of P(c) * prod_i P(d_i|c)^{N_{d,d_i}}

where:
c - best class
C - set of classes
P(c) - prior probability of class c
P(d_i|c) - conditional probability of word d_i given class c
N_{d,d_i} - frequency of word d_i in document d
Excuse me for the possibly weird notation.
In effect, this expression raises each word's probability to the power of its frequency in the document, which basically amounts to assuming (naively, I think) that "words that occur more often should carry more weight".
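If I'm reading my own rule back correctly, it's cleanest to compute in log space so the frequency power becomes a multiplier (and nothing underflows). A sketch, with made-up toy probabilities (the class names, words, and numbers are all hypothetical):

```python
import math
from collections import Counter

def classify(doc_words, priors, cond):
    """Pick the class maximizing P(c) * prod_i P(d_i|c)^{N_{d,d_i}}.

    priors: class -> P(c)
    cond:   class -> {word: P(word|class)}
    Computed as log P(c) + sum_i N_{d,d_i} * log P(d_i|c).
    """
    counts = Counter(doc_words)  # N_{d,d_i} for each word in the doc
    def score(c):
        return math.log(priors[c]) + sum(
            n * math.log(cond[c].get(w, 1e-9))  # tiny floor for unseen words
            for w, n in counts.items()
        )
    return max(priors, key=score)

# Hypothetical two-class toy model.
priors = {"sports": 0.5, "tech": 0.5}
cond = {
    "sports": {"ball": 0.4, "git": 0.01},
    "tech":   {"ball": 0.01, "git": 0.4},
}
print(classify(["git", "git", "ball"], priors, cond))  # prints "tech"
```

The repeated "git" counts twice in the exponent, so it dominates the single "ball" and the document lands in "tech"; without the frequency power each distinct word would count only once.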
I tried to use it for a while but gave up when I couldn't find a simple way of finding which of my Twitter follows also have a Subjot account. Has anyone tried/succeeded in doing this?
We don't have a solid design for that, but we've definitely kicked around the idea. If I want to find the top jotters on NFL, it would be great to see some sort of ranked list.
Oh awesome, I didn't know about that; I'll definitely look into it. Do you think people would be more likely to use it because they can type 'git' instead of 'tigger' (or whatever name it ends up being)?
It would definitely fit git's naming convention for commands better. Whether or not that makes it stickier with users is probably a question only focus testing can answer :)