Daniel Laufer
Physicist | Senior Consultant @ EY | NLP and People Data Scientist | Machine Learning and Software Engineer
In what I expect will be considered a seminal work on language models, a particularly interesting result the authors establish is an upper bound on the number of input tokens a self-attention head can take before some desired outputs are necessarily excluded from its reachable output set. This has interesting implications in settings where "lost in the middle" forgetfulness effects start coming into play: controllability, while achievable, comes not only at greater computational cost, but also with the risk that a self-attention head may select, but subsequently forget, pertinent information necessary to arrive at a desired output. Thanks for sharing, David!
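The formal bound lives in the paper, but the intuition can be sketched with a toy softmax calculation (plain Python; the function names are my own, not the paper's): with bounded score gaps, the weight a head can concentrate on any single token shrinks as the context grows, so relevant tokens get diluted among distractors.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def max_attention_weight(scores):
    """Largest weight any single token receives from this toy head."""
    return max(softmax(scores))

# One "relevant" token with score 1.0 among n-1 zero-score distractors:
# its weight is e / (e + n - 1), which decays as the context grows.
for n in (8, 64, 512):
    scores = [1.0] + [0.0] * (n - 1)
    print(n, round(max_attention_weight(scores), 3))
```

This is only an illustration of the dilution effect, not a reproduction of the paper's reachability argument, which concerns what outputs remain attainable at all.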
David Sauerwein
AI/ML at AWS | PhD in Quantum Physics
Glad you find it useful. I hope this gets picked up by many researchers. I think there's lots of room to find interesting results that could work for "post-transformer" models too.
My talented better half published her latest podcast episode about sustainable fashion! I'm so proud 💙💛 #fashionpsychology #startup #entrepreneurship
I'm honoured to have been selected to take part in Orbition Group's Driven by Data Mentorship Program! I look forward to taking part, meeting like-minded data professionals, and sharing in what will no doubt be some very thought-provoking conversations!
With the increasingly wide adoption of large language models, you may wonder how such models actually generate their responses. In neural language generation, there turn out to be a number of distinct ways to go from computing the softmax score that a given vocabulary token is the next token, to selecting the token that is actually produced. Each of these has trade-offs, typically between the computational cost of the selection process, its speed, and the quality of the text produced. Thank you, Damien, for sharing!
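As a toy sketch of that trade-off spectrum (plain Python over hypothetical logits; function names are my own, not any particular library's API): greedy decoding always takes the argmax, which is cheap and deterministic, while top-k sampling trades determinism for diversity by sampling among the k most likely tokens.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution; lower
    temperature sharpens it, higher temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def greedy(logits, vocab):
    """Always pick the highest-scoring token: fast and deterministic,
    but prone to repetitive text."""
    return vocab[max(range(len(logits)), key=lambda i: logits[i])]

def top_k_sample(logits, vocab, k=2, temperature=1.0, rng=random):
    """Restrict sampling to the k highest-scoring tokens, then
    renormalise: a middle ground between greedy and full sampling."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top], temperature)
    return vocab[rng.choices(top, weights=probs)[0]]

vocab = ["the", "cat", "sat"]
logits = [2.0, 1.0, 0.1]
print(greedy(logits, vocab))  # "the"
print(top_k_sample(logits, vocab, k=2))
```

Nucleus (top-p) sampling works the same way but keeps the smallest token set whose cumulative probability exceeds p instead of a fixed k.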
In the realm of language modeling, NuMind's approach to building task-specialized foundation models for structured entity extraction stands out. Their focus on incremental improvement (as evidenced by their work on the NuNER model family for named entity extraction tasks) and on quality, creating performant yet compact models (with their largest model comparable to GPT-4o at roughly a tenth of the parameters), is truly commendable. NuMind's latest blog post on NuExtract, their foundation model for structured extraction, offers valuable insights into their approach to designing a language model from first principles. Dive into the details here: #NuMind https://lnkd.in/eWfE9Zk9
A lovely idea brought to fruition by lovely people! I'm looking forward to it, Kinsey and Dan!
My multi-talented better half is going to be on the telly! Please tune in to what's going to be a lovely discussion on the social impacts of clothing, May 23rd at 7 pm CET!
Indeed a very important caveat for AI systems: if you haven't evaluated the efficacy of your model's outputs, it's ill-advised to place trust in them. Thank you for keeping us honest, Chip Huyen!