Extracting unstructured text and images into database tables with GPT-4 Turbo and Datasette Extract


Summary

The video showcases a new Datasette plugin that uses the GPT-4 API to extract structured data from a jazz venue's webpage, as part of building an events calendar for Half Moon Bay. The plugin defines an 'events' table with columns for event date, description, venue name, and start time, with the date formatted as yyyy-mm-dd. It imports five events from the page and also tests extraction from an image; the imported rows are then reviewed and corrected in the database for accuracy.


Introduction to GPT-4 Datasette Plugin

Demonstration of a new Datasette plugin that utilizes GPT-4 APIs to extract structured data into tables for building an events calendar for Half Moon Bay.

Selecting Data from Page

Selecting data from the page of the Bach Dancing and Dynamite Society, a local jazz venue.

Defining the Table

Defining the table named 'events' with columns for event date, description, venue name, and start time.
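
As a rough illustration of the table described above, here is a minimal sketch that creates an equivalent 'events' table directly in SQLite using Python's standard library. The column spellings (event_date, description, venue_name, start_time) and the TEXT types are assumptions; the video defines the columns through the plugin's form rather than in SQL.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# Column names and types are assumed spellings of the four columns
# named in the video; the plugin itself may store them differently.
conn.execute(
    """
    CREATE TABLE events (
        event_date  TEXT,  -- intended format: yyyy-mm-dd
        description TEXT,
        venue_name  TEXT,
        start_time  TEXT
    )
    """
)

# Read the column names back to confirm the definition.
columns = [row[1] for row in conn.execute("PRAGMA table_info(events)")]
print(columns)
```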

Formatting Date

Formatting the date as yyyy-mm-dd, based on the calendar view.
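
To make the target date format concrete (four-digit year, two-digit month, two-digit day), the sketch below converts a month-and-day string into yyyy-mm-dd. In the video the model is asked to produce this format itself; the helper name to_iso_date, the "March 29" input style, and the explicit year argument are all hypothetical and only show what the output should look like.

```python
from datetime import datetime


def to_iso_date(month_and_day: str, year: int) -> str:
    """Convert text such as 'March 29' plus a year into 'yyyy-mm-dd'.

    Hypothetical helper: the input style is an assumption, and month
    names are parsed as English under the default locale.
    """
    parsed = datetime.strptime(f"{month_and_day} {year}", "%B %d %Y")
    return parsed.strftime("%Y-%m-%d")


print(to_iso_date("March 29", 2024))
```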

Finalizing Instructions

Adding additional instructions to complete the table definition.

Passing Data to GPT-4

Passing the page text and the table definition to GPT-4, which extracts five events and imports them into the table.
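
The import step can be pictured as turning a model response into table rows. The sketch below does not call any API and is not the plugin's implementation: it assumes the response arrives as JSON with an "items" list (an assumed shape), uses placeholder values rather than real events from the video, and inserts the result into a table with the assumed column names.

```python
import json
import sqlite3

# Assumed column names for the 'events' table.
EXPECTED_COLUMNS = ("event_date", "description", "venue_name", "start_time")


def rows_from_response(response_text: str) -> list:
    """Turn a JSON response into tuples ordered like EXPECTED_COLUMNS.

    The {"items": [...]} wrapper is an assumption made for this sketch.
    Missing keys become None so they stand out during review.
    """
    data = json.loads(response_text)
    return [
        tuple(item.get(column) for column in EXPECTED_COLUMNS)
        for item in data["items"]
    ]


# Placeholder response, not data from the video.
sample_response = json.dumps(
    {
        "items": [
            {
                "event_date": "2024-01-01",
                "description": "Placeholder event",
                "venue_name": "Placeholder venue",
                "start_time": "19:00",
            }
        ]
    }
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events "
    "(event_date TEXT, description TEXT, venue_name TEXT, start_time TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    rows_from_response(sample_response),
)
imported = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(imported)
```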

Adding Images to Events

Incorporating images into the events table and testing the import with an image.

Reviewing Import

Reviewing the imported data in the database table and making corrections for accuracy.
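
One way to support this kind of review is to flag rows whose date does not match the yyyy-mm-dd pattern and then correct them by hand. The sketch below is an assumption about how such a check could look, using placeholder rows and plain SQLite; the video performs the review in the Datasette interface.

```python
import re
import sqlite3

# Pattern for the yyyy-mm-dd format requested for event dates.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def rows_needing_review(conn: sqlite3.Connection) -> list:
    """Return rowids whose event_date is missing or not yyyy-mm-dd."""
    return [
        rowid
        for rowid, event_date in conn.execute(
            "SELECT rowid, event_date FROM events ORDER BY rowid"
        )
        if not ISO_DATE.match(event_date or "")
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT, description TEXT)")
# Placeholder rows: the second has a deliberately malformed date.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2024-01-01", "Placeholder A"), ("Jan 2", "Placeholder B")],
)

flagged = rows_needing_review(conn)

# Manual correction of the flagged row, as a reviewer might do.
conn.execute(
    "UPDATE events SET event_date = ? WHERE rowid = ?",
    ("2024-01-02", flagged[0]),
)
remaining = rows_needing_review(conn)
print(flagged, remaining)
```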
