Categorizing text using vector-based embeddings



AI Summary

  • Overview of an embeddings categorizer:
    • Matches text to one of 20 categories using embeddings (OpenAI text-embedding-3-large).
    • Returns top three category matches for input text.
  • Examples of categorization:
    • “Hawaii”: Travel, History, Education.
    • “I’m studying to become a mathematician”: Education, Finance, Science.
    • “flowers”: Fact, Art, Travel.
    • “sport”: Sport, Travel, Politics.
    • “Mark Twain”: Travel, Literature, History.
    • “Hemingway”: Literature, Environment, History.
    • “Roman Empire”: History, Real Estate, Politics.
    • “brain”: Psychology, Science, Art.
    • “Picasso”: Art, Food, Psychology.
    • “meteor”: Science, Travel, Art.
  • Functionality and usage:
    • Can categorize text or chunks of information, or route user requests by intent.
    • Uses a CategoryMatcherApp class, plus a GPTCalls class for API calls to OpenAI.
    • Initializes with 20 categories and pre-computes embeddings for each.
    • Embeds the user input and finds the closest categories by embedding similarity.
    • Can return a specified number of top matches (default is three).
    • Serves as a proof of concept for automation in future projects.
  • Additional offerings:
    • Code available on Patreon.
    • Over 200 projects accessible for supporters.
    • Mention of other tools and resources:
      • Chat applications at cod.app with 900+ free GPT-powered chat apps.
      • AutoStreamer app for building live websites and content creation.
      • Live stream tutorials on deploying websites.
  • Pricing and support:
    • Download all chat applications for $100 on Patreon.
    • AutoStreamer app available for $200 on Patreon.
    • Acknowledgment of viewer support and anticipation for the next video.
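
The flow described under "Functionality and usage" (pre-compute one embedding per category, embed the input, rank categories by similarity, return the top matches) can be sketched roughly as follows. This is not the author's code, which the summary does not show: the class name CategoryMatcher, the helper functions, and the choice of cosine similarity are all assumptions made for illustration. The openai_embed function assumes the official openai Python package and an API key in the environment; toy_embed is a made-up offline stand-in so the ranking logic can be tried without network access.

```python
import math
from typing import Callable, List, Sequence, Tuple

# An embedder maps a batch of texts to one vector per text.
Embedder = Callable[[Sequence[str]], List[List[float]]]


def openai_embed(texts: Sequence[str]) -> List[List[float]]:
    """Embed texts via OpenAI (assumes the `openai` package and an API key)."""
    from openai import OpenAI  # imported lazily so the rest runs offline

    client = OpenAI()
    response = client.embeddings.create(
        model="text-embedding-3-large", input=list(texts)
    )
    return [item.embedding for item in response.data]


def toy_embed(texts: Sequence[str]) -> List[List[float]]:
    """Hypothetical offline stand-in: letter-count vectors, not real semantics."""
    vectors = []
    for text in texts:
        counts = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                counts[ord(ch) - ord("a")] += 1.0
        vectors.append(counts)
    return vectors


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class CategoryMatcher:
    """Illustrative matcher: category embeddings are computed once, up front."""

    def __init__(self, categories: Sequence[str], embed: Embedder) -> None:
        self.categories = list(categories)
        self.embed = embed
        self.category_vectors = embed(self.categories)

    def top_matches(self, text: str, k: int = 3) -> List[Tuple[str, float]]:
        """Return the k categories most similar to the text, best first."""
        query = self.embed([text])[0]
        scored = [
            (name, cosine_similarity(query, vector))
            for name, vector in zip(self.categories, self.category_vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]
```

With a real embedding model, usage would look like CategoryMatcher(categories, openai_embed).top_matches("Hawaii"). Passing the embedder in as a function keeps the ranking logic independent of the API wrapper, which is one plausible reason for a separate class such as GPTCalls.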