Scrapy source using scrapy #332

Merged 107 commits on Mar 5, 2024

Commits on Feb 26, 2024

  1. Scrapy source using scrapy

    Close queue
    
    Add requirements.txt
    
    Format code
    
    Cleanup code
    
    Fix linting issues
    
    Remove redundant config option
    
    Add revised README
    
    Add more docstring and cleanup sources
    
    Make api simpler
    
    Cleanup
    Sultan Iman authored and sultaniman committed Feb 26, 2024
    4d28bd9
  2. Add batching of results

    Add logging and batch size configuration
    
    Cleanup code
    Sultan Iman authored and sultaniman committed Feb 26, 2024
    ac63f7d
  3. Add pytest-mock and scrapy

    Add tests
    
    Adjust assert
    
    Update README
    
    Add missing diagram file
    
    Update README
    
    Close queue when exiting
    
    Check if queue close is called
    
    Log number of batches
    
    Fix linting issues
    
    Fix linting issues
    
    Mark scrapy source
    
    Fix linting issue
    
    Format code
    
    Yield!
    Sultan Iman authored and sultaniman committed Feb 26, 2024
    59fd5f9
  4. Adjust tests

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    d80446c
  5. Add pytest-twisted

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    04d8841
  6. Add twisted to scrapy dependencies

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    2741da8
  7. Add twisted to dev dependencies

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    a685dae
  8. Add review comments

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    78bcc6f
  9. Add more checks and do not exit when queue is empty

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    3208ea8
  10. Create QueueClosedError and handle in listener to exit loop

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    167fffa
  11. Simplify code

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    93ed13a
  12. Stop crawling if queue is closed

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    2756610
  13. Fix linting issues

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    1d73c9b
  14. Fix linting issues

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    61fd907
  15. Adjust tests and disable telnet server for scrapy

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    20e10a4
  16. Remove pytest-twisted

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    b3bf863
  17. Refactor scrapy item pipeline

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    4812fcc
  18. Eliminate custom spider

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    f79720f
  19. Rename a function

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    4c727cd
  20. Simplify code

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    8b8b417
  21. Cleanup code

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    a9193e8
  22. Update comment

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    fc66d97
  23. Update comment

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    a822218
  24. Fix linting issues

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    6d45468
  25. Define abstract method

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    7799bf1
  26. Update readme

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    059837f
  27. Add more tests

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    ba52057
  28. Adjust tests

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    82963d0
  29. Use pytest.mark.forked to run tests for ALL_DESTINATIONS

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    ba04471
  30. Add pytest-forked

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    2f4a378
  31. Update lockfile

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    07a140d
  32. Use scrapy signals

    Sultan Iman authored and sultaniman committed Feb 26, 2024
    c41fad2
  33. ade0069
  34. Add more types

    sultaniman committed Feb 26, 2024
    7296324
  35. 7f44d14
  36. 7785e56
  37. Simplify helpers code

    sultaniman committed Feb 26, 2024
    83ec743
  38. Cleanup code

    sultaniman committed Feb 26, 2024
    aadd6e4
  39. 14a18a1
  40. 78f8777
  41. 0e0d5ff
  42. Adjust config file

    sultaniman committed Feb 26, 2024
    a2733bd
  43. d8371de
  44. 47d24f5
  45. 505acff
  46. de154bb
  47. 20358cd
  48. f8fc527
  49. Adjust batch size

    sultaniman committed Feb 26, 2024
    9024f6b
  50. Fix queue batching bugs

    sultaniman committed Feb 26, 2024
    9cf8ec1
  51. 7378e75
  52. 4f5bcb8
  53. 657714a
  54. Rewrite tests

    sultaniman committed Feb 26, 2024
    e030b51
  55. d38ef2a

Commits on Feb 27, 2024

  1. Adjust queue read timeout

    sultaniman committed Feb 27, 2024
    e7ca332
  2. a4d9290
  3. ee8f3cd
  4. 2e15f07
  5. fc4a244
  6. Cleanup scraping helpers

    sultaniman committed Feb 27, 2024
    7334a6a
  7. 398d732
  8. f7347b1
  9. Update readme

    sultaniman committed Feb 27, 2024
    9139c00
  10. Cleanup code

    sultaniman committed Feb 27, 2024
    f9affb2
  11. e9a38e1
  12. Fix linting issues

    sultaniman committed Feb 27, 2024
    76ddfbc
  13. Fix linting issues

    sultaniman committed Feb 27, 2024
    6ccb247
  14. 1629a15
  15. f512b1d
  16. 36e55c5
  17. Use proper Union syntax

    sultaniman committed Feb 27, 2024
    dbd0d53
  18. c9da574
  19. Use latest dlt version

    sultaniman committed Feb 27, 2024
    3cf9b23
  20. e5d34c4

Commits on Feb 28, 2024

  1. 84767fb
  2. 6431b62
  3. 9352f9d
  4. f43d602
  5. Update test skip reason

    sultaniman committed Feb 28, 2024
    7eba014

Commits on Mar 1, 2024

  1. Stop crawler manually

    sultaniman committed Mar 1, 2024
    b5f0f06
  2. Return self from __call__

    sultaniman committed Mar 1, 2024
    2ef1350
  3. 2c4fd60
  4. 3ae0f1e
  5. fb9ddc6
  6. 5a881f1
  7. 9376720
  8. Update readme

    sultaniman committed Mar 1, 2024
    b03d564
  9. c03e6ab
  10. Adjust tests

    sultaniman committed Mar 1, 2024
    2b4d64a

Commits on Mar 4, 2024

  1. cad5924
  2. b95c9fe
  3. Update lockfile

    sultaniman committed Mar 4, 2024
    cb9a8a8
  4. Fix linting issues

    sultaniman committed Mar 4, 2024
    49847a8
  5. Use simple run_pipeline

    sultaniman committed Mar 4, 2024
    65a5019
  6. 7bfdc80
  7. d490620
  8. Update comments

    sultaniman committed Mar 4, 2024
    8667536

Commits on Mar 5, 2024

  1. 7cadb93
  2. 02e467b
  3. Format code

    sultaniman committed Mar 5, 2024
    ded3949
  4. Debug test queue

    sultaniman committed Mar 5, 2024
    56e8a54
  5. Adjust mock patch path

    sultaniman committed Mar 5, 2024
    2e587b9
  6. dd9d3f5
  7. ed33d15
  8. 9ef03bf
  9. Skip test

    sultaniman committed Mar 5, 2024
    eb8278c