How do I pass a cookie from an actor to a crawler

#1

I have a working actor, logging into a site. I can see the page I want to crawl.
If I ‘import’ my cookie manually into the crawler from EditThisCookie, the crawler works.
I want to use the actor and crawler inside a scheduled job.
How do I pass the cookie from the actor to the crawler?

#2

Hi,

If you’re starting the crawler from the act using the Apify API client, you can pass the cookies like this:

const execution = await apifyClient.crawlers.startExecution({
  crawlerId: 'v6hb9olk86gfd8',
  settings: {
      cookies: [
          {
              domain: 'some_domain',
              expires: 'some_date',
              ...
          },
          ...
      ]
  }
});

#3

Thanks @petr-cermak.

I’m starting the crawler from the scheduler on the apify website. I’m not using the API.

#4

Hi @Brett_Lewis

That isn’t a problem. You can have the scheduler start an actor (“act”) instead of the crawler, and inside the act you can start the crawler as @petr-cermak recommended:

// Get cookies after login
const myCookies = await page.cookies();

// call crawler with cookies
const execution = await apifyClient.crawlers.startExecution({
  crawlerId: 'v6hb9olk86gfd8',
  settings: {
      cookies: myCookies
  }
});

So your workflow should look like:
scheduler -> Actor act (grab cookies and start crawler) -> crawler (started with cookies)

Let me know if it helps.

#5

Many thanks for the help.

I’ve tried that code and I get an error…

2018-03-26T19:56:22.058Z User function threw an exception:
2018-03-26T19:56:22.060Z TypeError: Cannot read property 'startExecution' of undefined
2018-03-26T19:56:22.068Z at Apify.main (/home/myuser/main.js:48:46)
2018-03-26T19:56:22.069Z at

Apologies - I’m no coder (I’m sure that’s obvious by now) - I’m cobbling together code - but am pretty clueless. :frowning:

#6

Hi Brett,

no problem, you just don’t have apifyClient assigned. Update your code like this:

const apifyClient = Apify.client;
const execution = await apifyClient.crawlers.startExecution({
  crawlerId: 'v6hb9olk86gfd8',
  settings: {
      cookies: myCookies
  }
});

Here are the docs.

Best,
Jakub D.

#7

Many thanks for the help, Jakub.

I’m still getting the error. I suspect I may have a syntax error somewhere. I’ll refer to the docs and will try to sort this out.

#8

I would like to ask for help.

I set up a crawler with the URLs I want to crawl, and the actor is working; I tested it with the cookie/screenshot example. I just have a problem with passing the cookie from the actor to the crawler.

const Apify = require('apify');

Apify.main(async () => {
    const input = await Apify.getValue('INPUT');

    const browser = await Apify.launchPuppeteer();
    const page = await browser.newPage();
    await page.goto('http://xy.com/login');

    // Login
    await page.type('#form_user_login_email', input.username);
    await page.type('#form_user_login_password', input.password);
    await page.evaluate(() => { document.querySelectorAll('.btn-full-width')[1].click(); });
    await page.waitForNavigation();

    // Get cookies
    const cookies = await page.cookies();

    // Use cookies in other tab or browser
    //const page2 = await browser.newPage();
    //await page2.setCookie(...cookies);
    // Get cookies after login
  
    const apifyClient = Apify.client;
    // Call crawler with cookies
    const execution = await apifyClient.crawlers.startExecution({
        crawlerId: 'mhi',
        settings: {
            cookies: cookies
        }
    });

    console.log('Done.');
    
    console.log('Closing Puppeteer...');
    await browser.close();

});

I think the cookie is not passed, because the crawler is not logged in.

#9

Hi @adriankoooo,

I answered you in the Stack Overflow post.

#10

Hi @drobnikj.

Unfortunately that didn’t help.

In the meantime I contacted Apify support. They said this is probably a bug, but it will not be fixed for the crawler, because the crawler is deprecated.

Instead they recommended using the new Web Scraper actor, where I can pass initialCookies in the main input object. I tried everything, but I don’t know exactly how to do this.

I am not a professional here, but I think they mean:

actor “my-actor” (log in, get cookies, run the “web-scraper” task and pass it the cookies) -> “web-scraper” task (gets the cookies from the actor, scrapes the web pages).
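
For reference, the input the “web-scraper” task should end up receiving would look something like this (the URL and cookie values below are placeholders; initialCookies takes the same objects that page.cookies() returns):

```json
{
    "startUrls": [{ "url": "https://example.com/members" }],
    "initialCookies": [
        { "name": "session", "value": "abc123", "domain": "example.com", "path": "/" }
    ]
}
```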

So I tried the following code:

const Apify = require('apify');

Apify.main(async () => {
    const input = await Apify.getValue('INPUT');

    const browser = await Apify.launchPuppeteer();
    const page = await browser.newPage();
    await page.goto('http://yo.com/prihlaseni');

    // Login
    await page.type('#form_user_login_email', "@");
    await page.type('#form_user_login_password', "pwd");
    await page.evaluate(() => { document.querySelectorAll('.btn-full-width')[1].click(); });
    //await page.click('.btn-full-width');
    await page.waitForNavigation();

    // Get cookies
    const myCookies = await page.cookies();
    console.log(myCookies);

    const ApifyClient = require('apify-client');

    const apifyClient = new ApifyClient({
        userId: 'cEA',
        token: 'TE',
    });

    await apifyClient.tasks.runTask({
        taskId: 'u64',
        body: {
            "initialCookies": myCookies
        }
    });
});

2019-05-11T18:00:33.639Z ERROR: The function passed to Apify.main() threw an exception: (error details: type=invalid-parameter)

Maybe I am close to success; I just don’t know how to properly send the cookies as input for the web-scraper actor.

#11

Thanks to Lukas from Apify, who showed me that Apify.client was not necessary:

const taskInput = {
    initialCookies: myCookies
};

console.log("I'm the really new run");
const task = await Apify.callTask('taskIDsssososososo', taskInput);
console.dir(task);
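
For completeness, here is how the cookie objects map onto the task input, as a minimal sketch (the cookie below is made up to mimic what Puppeteer’s page.cookies() returns, and the task ID in the comment is a placeholder):

```javascript
// A made-up cookie in the shape Puppeteer's page.cookies() returns.
const myCookies = [
    { name: 'session', value: 'abc123', domain: 'example.com', path: '/' },
];

// Web Scraper accepts these objects directly as initialCookies,
// so the task input is just a wrapper object around them.
const taskInput = {
    initialCookies: myCookies,
};

console.log(JSON.stringify(taskInput));

// Inside the actor you would then start the task with:
//   const task = await Apify.callTask('your-task-id', taskInput);
```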
