Quiz #3 posted!
Quiz #3 has been posted, and is due Monday at midnight! We’ll have covered all the material for this quiz by Thursday the 23rd.
-
It’s the Jacobian, not the Hessian
I misspoke today in response to Garrett’s question about a vector-valued loss function (instead of a scalar loss function). If your loss (or any other) function returns a vector of values, then the matrix containing the partial derivative of each of those output values with respect to each of the function’s inputs is called the Jacobian matrix. It’s normally denoted \( J_{\!f}(x) \), and its entries are \( [J_{\!f}(x)]_{ij} = \frac{\partial f_i}{\partial x_j} \).
The Hessian matrix, \( H_{\!f}(x) \), is similar but is defined for a scalar-valued function and holds its second-order partial derivatives. (In other words, its entries are \( [H_{\!f}(x)]_{ij} = \frac{\partial^2 f}{\partial x_i\,\partial x_j} \).)
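If you want to poke at these yourself, here’s a minimal sketch (not from our class code; just the standard PyTorch helpers torch.autograd.functional.jacobian and torch.autograd.functional.hessian applied to toy functions I made up):

import torch

# A toy vector-valued function f: R^3 -> R^2
def f(x):
    return torch.stack([x[0] * x[1], x[1] * x[2] ** 2])

# A toy scalar-valued function g: R^3 -> R
def g(x):
    return (x ** 2).sum()

x = torch.tensor([1.0, 2.0, 3.0])

# Jacobian of f at x: a 2x3 matrix of first-order partials df_i/dx_j
J = torch.autograd.functional.jacobian(f, x)

# Hessian of g at x: a 3x3 matrix of second-order partials d^2 g / (dx_i dx_j)
H = torch.autograd.functional.hessian(g, x)

print(J)  # [[2., 1., 0.], [0., 9., 12.]]
print(H)  # 2 * identity matrix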
-
Today’s code posted
I have pushed our code from today to the class GitHub repo (see the file demo_autodiff.py).
Btw, I may have completely forgotten to mention the name of the awesome algorithm used to systematically back-compute the partial derivatives of the loss function with respect to all the model parameters. It is called autodiff (short for automatic differentiation). In a humorous twist, the people at Meta who developed PyTorch apparently misheard the name and thought it was “autograd” (which makes sense, actually, since the gradient is precisely the vector containing all those partial derivatives), and so you will see references throughout the PyTorch docs to “autograd.” I prefer to use the original name.
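For a flavor of what that looks like in PyTorch (a generic sketch, not the contents of demo_autodiff.py):

import torch

# Two "model parameters" we want gradients for
w = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

# A tiny model and a squared-error loss on one training example
x, y_true = 2.0, 5.0
y_pred = w * x + b
loss = (y_pred - y_true) ** 2

# Autodiff/autograd walks the computation graph backwards and fills in .grad
loss.backward()

print(w.grad)  # d(loss)/dw = 2 * (y_pred - y_true) * x = 8.0
print(b.grad)  # d(loss)/db = 2 * (y_pred - y_true) = 4.0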
-
XP cards
I forgot to mention today that I would accept people’s XP cash cards as a mid-semester cash-out. So how about this: if you want to cash in your cards mid-semester, you can do so next Thursday (right after fall break).
(Note that if you don’t turn them in mid-semester, there is no grade disadvantage: it just means the points won’t appear on the scoreboard until December, and it means that you have to keep track of your cards for that much longer.)
-
Quiz #2 posted!
Quiz #2 has been posted, and is open-Python and timed at 60 minutes.
So as not to rush anybody, I made it due on Oct. 15th instead of Oct. 10th. But we’ve already covered everything needed for the quiz.
-
logreg.py (and logreg_distilled.py) posted
In the class git repo.
-
Office hours time change — 10/7
On Tuesday the 7th, my office hours will be 1:30-3:30pm instead of the normal 12-2pm.
-
Homework #3 errors
A student has just pointed out a couple of errors in pytorch_practice.py, which are now fixed. If you’ve already git pulled (or copied the contents of that file some other way), then git pull again (or re-copy).
-
Homework #3 posted!
As promised, Homework #3 has been posted, and is due on October 17th at midnight. It is a play in two acts. Send questions!
Also, I offer +5XP to anyone who finds a legit bug in my co-occurrence code or supporting programs and reports it! Only the first person who reports any specific bug gets the reward for finding that bug, though.
And yes, these rewards can be stacked! (just like tensors!)
-
Visualizing embeddings
I’ve pushed several files to the class repo, including two programs to help you visualize the embeddings in your corpus: interact_cooccur.py, which we played with in class on Tuesday, and visualize_cooccur.py, which can produce 2-d (and even 3-d) plots like this showing the embeddings in a reduced-dimensional space:

The next homework assignment (coming soon) will have you running and configuring these programs to help you analyze your own corpus’s embeddings. Stay tuned for that.
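In case you’re curious how that kind of plot gets made in general (this is just a generic sketch, not the actual visualize_cooccur.py, and the words and embeddings here are made up), the usual recipe is to project the embedding matrix down to 2 dimensions with something like PCA and scatter-plot the result:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Pretend data: a (vocab_size x embedding_dim) matrix and its words.
# In your homework these will come from your own corpus.
words = ["cat", "dog", "fish", "car", "truck"]
embeddings = np.random.rand(len(words), 50)

# Project the 50-dimensional embeddings down to 2 dimensions
coords = PCA(n_components=2).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1])
for word, (px, py) in zip(words, coords):
    plt.annotate(word, (px, py))
plt.title("Embeddings projected to 2-D")
plt.show()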
Also, I have posted the code we used to play around with standard pre-trained embedding collections (like word2vec and GloVe): you’ll need to first run the download_embeddings.py file (while connected to a good network) and then run either sim_emb_play.py or closest_emb_play.py to find the similarity of pairs of words, or the top-10 closest embeddings to a given word, respectively.
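As a rough idea of what those two scripts do under the hood, here’s a sketch using the gensim library (my actual scripts may be written differently, so treat this as illustrative only):

import gensim.downloader

# Download (on first run) and load a small pre-trained GloVe collection
glove = gensim.downloader.load("glove-wiki-gigaword-50")

# Similarity of a pair of words (roughly what sim_emb_play.py reports)
print(glove.similarity("coffee", "tea"))

# Top-10 closest embeddings to a given word (roughly what closest_emb_play.py reports)
print(glove.most_similar("coffee", topn=10))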

