In this edition of QuEST, Michael Robinson discusses topological features in large language models.
Key Moments and Questions in the video include:
Acknowledgement of colleagues from DARPA and Galois
Manifolds in machine learning
The LLM token space is high-dimensional
Manifold spaces tend to be negatively curved
LLMs turn text into vectors
Transformers turn vectors into new text
How do we turn text into vectors? (see the embedding sketch after this list)
We think of LLMs as being trained on all human language, but they have not been
GPT-2, an open-source LLM, as the source for the model
GPT-2 used as the example
Tokens have topology and geometry
Words are a categorical variable
Vectors are a numerical variable
Mixing data types can lead to problems
Why care about the token space?
Not all tokens correspond to a valid vector
Estimating dimension
Volume of a sphere as a function of radius
Log of volume vs. log of radius curves (see the worked equation and code sketch after this list)
Ricci scalar curvature
Stratifications are visible
GPT-2 uses a state space that is not a manifold
The dollar sign shows up differently in GPT-2 because $ is used in code, where other currency symbols are not
GPT-2's 768 dimensions unwrapped using t-SNE (see the t-SNE sketch after this list)
Tokens with leading spaces
Beginnings of words show up as a separate low-dimensional piece
Visual similarity to the hyperbolic plane
LLEMMA 7B dimensions
Plotting dimension
Dark regions of the plot are non-printing characters
Thinking about how neural activation patterns work
We have been thinking about manifold learning out of mathematical convenience
State spaces are not manifolds
Open presentation to conversation
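
For the "turn text into vectors" step above, here is a minimal sketch of how this works with the open-source GPT-2 weights (assuming the Hugging Face transformers and PyTorch packages; the prompt string is arbitrary and not taken from the talk):

from transformers import GPT2Tokenizer, GPT2Model

# Load the open-source GPT-2 tokenizer and model weights.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")

# Text -> token IDs -> 768-dimensional embedding vectors.
ids = tokenizer("Tokens have topology and geometry", return_tensors="pt")["input_ids"]
vectors = model.get_input_embeddings()(ids)
print(vectors.shape)  # (1, number_of_tokens, 768)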
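The dimension-estimation moments rest on how the volume of a small ball scales with its radius. For a geodesic ball of radius r in a d-dimensional space, the standard Riemannian expansion (textbook geometry, not a formula quoted from the talk) gives

V(r) = \frac{\pi^{d/2}}{\Gamma(d/2+1)}\, r^{d} \left( 1 - \frac{S\, r^{2}}{6(d+2)} + O(r^{4}) \right),

so that

\log V(r) = d \log r + \text{const} - \frac{S\, r^{2}}{6(d+2)} + O(r^{4}).

To leading order the slope of the log-volume vs. log-radius curve is the dimension d, and the way the curve bends away from a straight line reflects the Ricci scalar curvature S at that point (negative curvature bends it upward).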
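A naive way to read that slope off real data, sketched in Python (assumptions: points is the token-embedding matrix, center indexes one token, radii is a hand-picked range small enough to stay local but large enough that every ball is non-empty; the function name is hypothetical, not the speaker's tooling):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_dimension(points, center, radii):
    # N(r) = number of points inside a ball of radius r about one point;
    # on a d-dimensional set, log N(r) grows roughly like d * log r.
    nbrs = NearestNeighbors().fit(points)
    counts = [
        nbrs.radius_neighbors(points[center:center + 1], radius=r,
                              return_distance=False)[0].size
        for r in radii
    ]
    # Slope of the log-log curve is the local dimension estimate.
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope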
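For the t-SNE picture of GPT-2's 768-dimensional token embeddings, a minimal reproduction might look like the following (assuming scikit-learn, matplotlib, and transformers; the perplexity value is a guess, since the talk's plotting parameters are not given here):

import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from transformers import GPT2Model

# Pull the full (50257, 768) token-embedding matrix out of GPT-2.
weights = GPT2Model.from_pretrained("gpt2").get_input_embeddings().weight
X = weights.detach().numpy()

# Unwrap the 768 dimensions into 2 for plotting.
xy = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(X)
plt.scatter(xy[:, 0], xy[:, 1], s=1)
plt.show()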
Date Taken: 10.25.2024
Date Posted: 10.25.2024 17:07
Category: Video Productions
Video ID: 941460
VIRIN: 241025-F-EG995-4282
Filename: DOD_110646971
Length: 01:00:53
Location: US
This work, Michael Robinson - Topological Features in Large Language Models (and beyond?), by Kenneth M McNulty and Kevin D Schmidt, identified by DVIDS, must comply with the restrictions shown on https://www.dvidshub.net/about/copyright.