How do I parse a JSON response within interceptRequestData?


#1

Hi,

I enqueue pages that return JSON, and I need to parse this JSON. Inspired by this topic and this topic, I figured I need to parse it within interceptRequestData:

context.enqueuePage({
    url: 'https://example.org',
    label: 'overview',
    interceptRequestData: {
        id: thisid,
        name: ???
    }
});

However, I’m not sure how to create a JSON object from there. I tried replacing ??? with $(this).name, but it’s empty. I also tried setting it to JSON.parse($('body pre').text()).name, but that throws

SyntaxError: JSON Parse error: Unexpected EOF

and anyway, since I actually have a few fields to parse, I believe there must be a more efficient way.

Any hints?


#2

Hello,

If I understand this correctly, you want to enqueue a page, crawl it and output some properties of the JSON it returns.

If that’s correct, you don’t need to do anything special to enqueue the page. You just enqueue it like any other. Then, in the pageFunction, when you receive the page’s data (which you detect by checking its label, in your case overview), you call JSON.parse($('body pre').text()) to extract the JSON.
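
To make that concrete, here is a minimal sketch of such a pageFunction. It assumes that jQuery injection is enabled and exposed as context.jQuery, and that the object returned from pageFunction becomes the page’s result; the name field comes from your question, and the start-page branch is only illustrative, so adapt it to your crawler.

```javascript
// Sketch only: assumes the crawler passes a `context` object with
// `jQuery`, `request.label` and `enqueuePage`, as discussed in this thread.
function pageFunction(context) {
    var $ = context.jQuery;

    // Pages enqueued with the 'overview' label are the ones that return JSON.
    if (context.request.label === 'overview') {
        // The browser shows a raw JSON response inside <body><pre>...</pre>.
        var data = JSON.parse($('body pre').text());

        // Parse once, then pick as many fields as you need.
        return { name: data.name };
    }

    // Any other page: enqueue the JSON page like a normal page.
    context.enqueuePage({
        url: 'https://example.org',
        label: 'overview'
    });
    return null;
}
```

Because the JSON is parsed only on the page that actually contains it, $('body pre').text() is no longer empty at the time JSON.parse runs, which is the likely cause of the "Unexpected EOF" error you saw on the enqueuing page.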

There are obviously more efficient ways to do this, but they would require you to use our Actor product instead of Crawler.


#3

Thanks. For some reason I thought I had to use interceptRequestData, but indeed an if/then on context.request.label does the job.