NEW: AIPRM Live Crawling

aiprm-christophc · March 22, 2023, 10:23pm

AIPRM is getting Live Crawling

ChatGPT only has info until mid-2021
Current/newer events are unknown
Live Crawling can solve this by crawling web pages in real-time
The Live Crawling Feature allows for injecting fresh content from crawled websites.

Here’s a quick teaser video

GradyMedia · March 23, 2023, 1:22am

I guess I’m confused here. I just asked GPT-3.5 a similar question and got a result not much different from the one in the video.
I know the training data ends in 2021, but how can you tell that it is actually crawling the URL versus reading data from 2021?

Ammon · March 23, 2023, 3:46am

I’m 100% in agreement with @GradyMedia on this, in that I need to see that we are getting data that could not possibly have been predicted, and is radically different to that gained any other way.

I know it can be done, you just need to find the right topic to demonstrate that it is being done, and can’t be gained without AIPRM having extracted the data from the URL first to prompt with.

RealityMoez · March 23, 2023, 3:56am

@Ammon

Isn’t ChatGPT just guessing the URL content? (without the upcoming feature)

Ammon · March 23, 2023, 4:12am

Yes. Without something feeding freshly crawled data into a prompt, all that ChatGPT can do is predict, based on patterns, the statistically most probable result. But this video is, I think, meant to show AIPRM first taking a URL from the user, then crawling that URL, extracting the text from the page, putting it into a dynamic prompt, and only then passing that prompt, containing the crawled data, over to ChatGPT.
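To make that sequence concrete, here is a minimal sketch of the steps in Python. It is not AIPRM's actual implementation: the function names, the `max_chars` limit, and the example template are all assumptions for illustration, and the only detail taken from this thread is the `[CRAWLEDTEXT]` placeholder mentioned further down.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def fetch_html(url, timeout=10):
    """Step 1: crawl the URL the user supplied (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_text(html):
    """Step 2: reduce the page to plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def build_prompt(template, crawled_text, max_chars=4000):
    """Step 3: inject the crawled text into the prompt template.

    max_chars is an assumed limit to keep the prompt within the
    model's context window.
    """
    return template.replace("[CRAWLEDTEXT]", crawled_text[:max_chars])
```

The finished prompt is what gets sent to ChatGPT, so the model is summarizing text it was handed rather than guessing from the URL. For example, `build_prompt("Summarize this page: [CRAWLEDTEXT]", extract_text(fetch_html(url)))` would produce the prompt for whatever `url` the user supplied.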

Incidentally, some kinds of URLs are far easier to predict. Anything in a standard format, for example, follows a standard pattern: press releases and news stories all tend to have very clear, easy-to-predict patterns.

Take something like a news page about one company acquiring another. If the URL includes something like [Company 1] acquires [Company 2], then we can predict tons of information on that page just from the fact that those types of stories have such a clear and standard format.

One company acquires another. The one doing the acquiring is obviously the bigger, because it could afford to buy out company 2.

The page will summarize who company 1 are and what they are famous for, which is likely to be the same stuff they were famous for (or working on) just 2 years ago.

The page will summarize who company 2 are in the same sort of way.

They’ll usually stress the thing that both companies are famous for as a shared interest, since statistically that common interest is why company 1 will have bought out company 2. Especially if it is an interest that is a main focus for company 2, and something that company 1 is less known for but known to be expanding in (2 years ago).

There’s always a statement about how Company 1 expects to expand its capabilities in [insert shared interest that was a primary fame point for Company 2] thanks to the acquisition.

See how that pattern is so predictable?

So to show that this is not just prediction, we need to see details that could not be predicted from data 2 years ago.
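One way to run that kind of test, sketched in Python under my own assumptions (the helper names and the page layout are made up for illustration): publish a page containing a random token that did not exist before today, ask for a summary of that URL, and check whether the token shows up in the answer. A model predicting from old data has no way to produce it.

```python
import secrets


def make_canary_page():
    """Build a tiny HTML page containing a freshly generated random token."""
    token = "canary-" + secrets.token_hex(8)
    html = f"<html><body><p>Verification code: {token}</p></body></html>"
    return token, html


def response_used_page(response_text, token):
    """True only if the response contains the unpredictable token."""
    return token in response_text
```

If `response_used_page` returns `False` for a prompt that asked about the canary page, the answer was pattern prediction, not crawled content.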

RealityMoez · March 23, 2023, 4:16am

OK, good points. So the Live Crawling feature will solve our concerns, right?
(by extracting the text from the URL into the [CRAWLEDTEXT] area …)

aiprm-christophc · March 23, 2023, 8:08am

You are all very right.

Even without this feature the inference in the earliest ChatGPT was so good that people STILL think that ChatGPT can crawl.

I was asked about it just yesterday in a message.

I wrote this explanation for it, two months ago

With the AIPRM Live Crawling it will be possible to inject actual web page data, and that will be especially useful combined with the now larger token context in GPT-4.

DaveTucker · July 2, 2023, 9:46am

ChatGPT does not have internet access. However, it was trained on archived data up to 2021. So whatever data it appears to extract from a given URL is not actually due to a prompt’s ability to live crawl. It is because it was trained on a phenomenally large amount of data, which most likely included data from these publicly available URLs. Therefore, it appears as if it is live crawling. The fact, however, is that it is not. Hence, it is misleading to say that the prompt is live crawling.

RealityMoez · July 2, 2023, 9:52am

Live Crawling solves that bottleneck. AIPRM Live Crawling extracts data from that URL and feeds it into the prompt, so ChatGPT doesn’t rely on its 2021 training data.

So why is it misleading?

DaveTucker · July 2, 2023, 10:01am

Sorry mate, not quite sure what you mean. Please help me understand. Many thanks.

RealityMoez · July 2, 2023, 10:20am

Here’s an example:

with this prompt:

Alyahya · August 26, 2023, 12:12pm

ChatGPT can access the interwebs. There is a verified plugin for ChatGPT Plus. I’ve used it and it works perfectly. I even crawled a few websites and got SEO information.