Daniel Laufer on LinkedIn: In what I expect will be considered a seminal work on language models, a… (2024)

Daniel Laufer

Physicist | Senior Consultant @ EY | NLP and People Data Scientist | Machine Learning and Software Engineer


In what I expect will be considered a seminal work on language models, one particularly interesting result the authors elucidate is that there exists an upper bound on the number of input tokens provided to a self-attention head for which exclusion from the reachable set of desired outputs is unavoidable. This has interesting implications in settings where forgetfulness-in-the-middle effects start coming into play: controllability, while achievable, comes not only at greater computational cost, but also with the risk that a self-attention head may select, but subsequently forget, pertinent information necessary to arrive at a desired output. Thanks for sharing, David!


1 Comment


David Sauerwein

AI/ML at AWS | PhD in Quantum Physics

4d


Glad you find it useful. I hope this gets picked up by many researchers. I think there is lots of room to find interesting results that could work for "post-transformer" models too.


More Relevant Posts

  • Daniel Laufer


    My talented better half published her latest podcast episode about sustainable fashion! I'm so proud 💙💛 #fashionpsychology #startup #entrepreneurship


  • Daniel Laufer


    I'm honoured to have been selected to take part in Orbition Group's Driven by Data Mentorship Program! I look forward to taking part, meeting like-minded data professionals, and sharing in what will no doubt be some very thought-provoking conversations!


  • Daniel Laufer


    In the wake of the increasingly wide adoption of large language models, you may ask how such models generate their responses. In neural language generation, there turn out to be a number of distinct ways to go from computing the softmax score that a given token from a vocabulary set is the next token to selecting which token is actually produced. Each of these has trade-offs, typically between how computationally expensive the selection process is, how fast it is, and the quality of the text produced. Thank you, Damien, for sharing!


  • Daniel Laufer


    In the realm of language modeling, NuMind's approach to building task-specialized foundation models for structured entity extraction stands out. Their focus on incremental improvement (as evidenced by their work on the NuNER model family for named entity extraction tasks) and on quality in creating performant yet compact models (with their largest model comparable to GPT-4o with roughly 1/10 fewer parameters) is truly commendable. NuMind's latest blog post on NuExtract, their foundation model for structured extraction, offers valuable insights into their approach to designing a language model from first principles. Dive into the details here: https://lnkd.in/eWfE9Zk9 #NuMind


  • Daniel Laufer


    A lovely idea brought to fruition by lovely people! I'm looking forward to it, Kinsey and Dan!


  • Daniel Laufer


    My multi-talented better half is going to be on the telly! Please tune in to what's going to be a lovely discussion on the social impacts of clothing on May 23rd at 7 pm ECT!


  • Daniel Laufer


    Indeed, a very important caveat for AI systems: if you haven't evaluated the efficacy of your model outputs, it's ill-advised to place trust in them. Thank you for keeping us honest, Chip Huyen!

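The post above on how language models go from softmax scores to an emitted token can be made concrete with a small sketch. The post does not name the selection strategies it has in mind, so the four shown here (greedy selection, temperature scaling, top-k sampling, and top-p or nucleus sampling) are an assumption, picked because they are commonly used. The toy vocabulary, the logits, and all function names are hypothetical and exist only for illustration. Beam search, which is more expensive, is left out.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(probs):
    """Cheapest strategy: always pick the single most probable token."""
    return max(range(len(probs)), key=probs.__getitem__)

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

def sample(dist, rng):
    """Draw one token id from a {token_id: probability} mapping."""
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]

# Hypothetical four-token vocabulary and logits, for illustration only.
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(logits)
rng = random.Random(0)  # seeded so repeated runs draw the same tokens

print("greedy:", vocab[greedy(probs)])
print("top-k (k=2):", vocab[sample(top_k_filter(probs, 2), rng)])
print("top-p (p=0.8):", vocab[sample(top_p_filter(probs, 0.8), rng)])
```

Greedy selection is deterministic and cheap but tends to produce repetitive text, while the sampling variants trade a little extra work per step for more varied output. That is one instance of the cost, speed, and quality trade-off the post describes.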


Article information

Author: Prof. An Powlowski
