In this edition of QuEST, Michael Robinson will discuss topological features in large language models.
Key Moments and Questions in the video include:
Acknowledgement of colleagues from DARPA and Galois
Manifolds in machine learning
LLM token space is higher dimensional
Manifold spaces tend to be negatively curved
LLMs turn text into vectors
Transformers turn vectors into new text
How do we turn the text into vectors?
We think of LLMs as being trained on all human language, but they have not been
GPT2, an open-source LLM, as the source model
GPT2 used as the example
Tokens have topology and geometry
Words are a categorical variable
Vectors are a numerical variable
Mixing data types can lead to some problems
Why care about the token space?
Not all tokens correspond to a valid vector
Estimating dimensions
Volume of a sphere
Log of Volume vs log of radius curves
Ricci scalar curvature
Stratifications are visible
GPT2 uses a state space that is not a manifold
Dollar sign appears differently in GPT2 because the $ is used in code, whereas other currency symbols are not
GPT2’s 768 dimensions unwrapped using t-SNE
Tokens with leading spaces
Beginnings of words show up in a separate piece of low dimension
Visual similarity to hyperbolic plane
Llemma 7B dimensions
Plotting dimension
Dark spaces are non-printing characters
Thinking about how neural activation patterns work
We have been thinking about manifold learning out of mathematical convenience
State spaces are not manifolds
Open presentation to conversation
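The "Estimating dimensions", "Volume of a sphere", and "Log of Volume vs log of radius curves" items above rest on a standard fact: the volume of a ball of radius r in d dimensions scales as r^d, so the slope of log(volume) against log(radius) approximates d. The following is a minimal sketch of that idea, not the speaker's code. It assumes that counting neighbors within a radius is an acceptable stand-in for volume, and it uses a synthetic point cloud in place of real token vectors; the function name `local_dimension` and all parameters are hypothetical.

```python
import numpy as np


def local_dimension(points, center, radii):
    """Estimate dimension near `center` as the slope of log(count) vs log(radius).

    The number of points within radius r is used as a proxy for ball volume
    (an assumption of this sketch).
    """
    dists = np.linalg.norm(points - center, axis=1)
    counts = np.array([(dists <= r).sum() for r in radii])
    # Least-squares line through the log-log curve; its slope is the estimate.
    slope, _intercept = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope


rng = np.random.default_rng(0)

# Synthetic stand-in for token vectors: a flat 2-D sheet embedded in 10-D space.
sheet = np.zeros((20000, 10))
sheet[:, :2] = rng.uniform(-1.0, 1.0, size=(20000, 2))

radii = np.linspace(0.1, 0.5, 9)
estimate = local_dimension(sheet, np.zeros(10), radii)
print(f"estimated local dimension: {estimate:.2f}")
```

On this synthetic sheet the estimate should land near 2 even though the ambient space has 10 dimensions. Applying the same measurement at different points of a real embedding and getting different slopes is one way varying dimension could show up.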
Date Taken: 10.11.2024
Date Posted: 10.25.2024 17:07
Category: Video Productions
Video ID: 941460
VIRIN: 241025-F-EG995-4282
Filename: DOD_110646971
Length: 01:00:53
Location: US
This work, Michael Robinson - Topological Features in Large Language Models (and beyond?), by Kenneth M McNulty and Kevin D Schmidt, identified by DVIDS, must comply with the restrictions shown on https://www.dvidshub.net/about/copyright.