LearnX


A predictive algorithm to identify subjects and main topics from transcripts of online lectures.


How it works


LearnX uses natural language processing and machine learning to identify the subjects and topics of a lecture.

Wikipedia glossary pages on different subjects are used to train a Naive Bayes classifier, which is then used to identify the main subject of each lecture.

Lecture text can be pasted into the algorithm for analysis. A script that can download up to 10 lectures from the MIT OpenCourseWare (OCW) website is also available.

The texts are transformed into a bag-of-words representation using a TF-IDF vectorizer and fed to the classifier, which predicts the subject of each lecture.
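The vectorize-then-classify step described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not LearnX's actual code: the short training strings are hypothetical stand-ins for the text of Wikipedia glossary pages, and the use of `MultinomialNB` is an assumption, since the page only says "Naive Bayes".

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical stand-ins for the text of Wikipedia glossary pages,
# keyed by subject. Real glossary pages are far longer.
training_texts = {
    "physics": "force energy momentum velocity mass acceleration gravity quantum",
    "chemistry": "molecule reaction acid base compound bond catalyst solvent",
    "biology": "cell gene protein organism enzyme evolution species membrane",
}

subjects = list(training_texts)
documents = [training_texts[subject] for subject in subjects]

# TF-IDF turns each text into a weighted bag-of-words vector;
# the Naive Bayes classifier learns one word distribution per subject.
# MultinomialNB is an assumed choice of Naive Bayes variant.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(documents, subjects)

# Hypothetical lecture excerpt to classify.
lecture = "Today we look at how force and mass determine the acceleration of a body."
predicted_subject = model.predict([lecture])[0]
print(predicted_subject)
```

Wrapping the vectorizer and classifier in one pipeline ensures that a new lecture is transformed with the same vocabulary and weights that were learned from the training pages.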

Non-Negative Matrix Factorization (NMF) is used to extract topics from each lecture.


Learn from Wikipedia


To train the algorithm, we use Wikipedia glossary pages. The following pages are used by default. Add or remove pages as necessary.

Click LEARN when ready.

1 physics
2 chemistry
3 biology
4 probability and statistics
5 elementary quantum mechanics
6 classical physics
7 gene expression terms
8 artificial intelligence
9 astronomy
10 game theory