
Proxies work with vanilla requests, but Crawlee gets a 403 error #1683

@DoctorEvil92

Description


I'm trying to scrape a site that needs geotargeting to Colombian IP addresses. With vanilla requests and no custom headers, it returns a full-sized response, but with Crawlee it always returns a 403 error. Here is the code. I left proxy_url blank so as not to expose my credentials; you would need to put in a proxy URL that targets Colombian IPs.

import asyncio
from datetime import timedelta

import requests

from crawlee import ConcurrencySettings, Request
from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
from crawlee.proxy_configuration import ProxyConfiguration
from crawlee.router import Router
from crawlee.sessions import SessionPool


async def main():
    # Define the target and proxy URLs.
    target_url = "https://www.exito.com/"  # target site
    proxy_url = ""  # left blank to not expose my credentials

    # First, fetch the page with vanilla requests through the same proxy.
    r = requests.get(target_url, timeout=30, proxies={"http": proxy_url, "https": proxy_url})
    print("Content length with vanilla requests:", len(r.text))


    # Define the router and a handler that prints the first bytes of the body.
    router = Router[BeautifulSoupCrawlingContext]()

    @router.handler("MAIN")
    async def main_handler(context: BeautifulSoupCrawlingContext) -> None:
        print("inside handler")
        response = await context.http_response.read()
        print(response[:50])
    # Then make the same request with Crawlee through the same proxy.
    crawler = BeautifulSoupCrawler(
        request_handler=router,
        concurrency_settings=ConcurrencySettings(desired_concurrency=1, max_concurrency=1),
        max_request_retries=15,
        session_pool=SessionPool(
            max_pool_size=1,
            create_session_settings={
                "max_usage_count": 999999999,
                "max_age": timedelta(hours=999999),
                "max_error_score": 100000,
            },
        ),
        proxy_configuration=ProxyConfiguration(proxy_urls=[proxy_url]),
    )

    # Run the crawler on the labeled start request.
    await crawler.run([Request.from_url(target_url, label="MAIN")])

if __name__ == '__main__':
    asyncio.run(main())
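
Not part of my repro above, but one way to narrow this down might be to compare the default headers each client sends, since as far as I know Crawlee's default HTTP client is built on httpx and sites often reject non-browser header sets with a 403. A minimal diagnostic sketch, assuming httpbin.org is reachable:

import requests
import httpx

# httpbin.org/headers echoes the request headers back as JSON, so we can
# see exactly what each client sends by default.
ECHO_URL = "https://httpbin.org/headers"

print("requests sends:", requests.get(ECHO_URL, timeout=30).json()["headers"])
print("httpx sends:   ", httpx.get(ECHO_URL, timeout=30).json()["headers"])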

I understand this may be tricky to reproduce, so I will also post an image of what it prints when run. You can see that it works with vanilla requests; this is what happens every time I run it.

[Image: console output from a run, showing the vanilla request returning content]
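
In case it helps, a possible workaround direction I have not verified: passing browser-like headers into Crawlee via HttpxHttpClient. I'm assuming extra keyword arguments are forwarded to the underlying httpx.AsyncClient, and the header values (User-Agent, Accept-Language) are only illustrative:

import asyncio

from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
from crawlee.http_clients import HttpxHttpClient

# Hedged sketch, not a confirmed fix. Assumption: extra keyword arguments
# to HttpxHttpClient are forwarded to the underlying httpx.AsyncClient;
# verify against your installed Crawlee version.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "es-CO,es;q=0.9",  # illustrative, matches the geotargeting
}


async def main() -> None:
    crawler = BeautifulSoupCrawler(
        http_client=HttpxHttpClient(headers=BROWSER_HEADERS),
        # plus the same proxy_configuration / session_pool settings as above
    )

    @crawler.router.default_handler
    async def handler(context: BeautifulSoupCrawlingContext) -> None:
        # Print the URL and the first bytes of the body, as in my repro.
        print(context.request.url, "->", (await context.http_response.read())[:50])

    await crawler.run(["https://www.exito.com/"])


if __name__ == "__main__":
    asyncio.run(main())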
