Merge branch 'main' of https://github.com/lryanle/SMARE
lryanle committed Dec 2, 2023
2 parents c3aa66d + 30bc2d5 commit cf2acd7
Showing 19 changed files with 1,044 additions and 274 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -130,4 +130,7 @@ dist
.pnp.*

# misc
*.DS_STORE
*.DS_STORE

# python
*.pyc
33 changes: 33 additions & 0 deletions README.md
@@ -16,3 +16,36 @@ Make a copy of the ``.env.example`` file and make the following changes.
2. Paste the username and password provided in MongoDB Atlas (if you should have access but do not, please contact @waseem-polus)

3. Paste the connection URL provided in MongoDB Atlas. Include the username and password fields using ``${VARIABLE}`` syntax to embed the value of each variable.
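
A minimal sketch of what embedding values with ``${VARIABLE}`` amounts to. The Pipfile in this commit lists `python-dotenv`, which expands such references when it loads a `.env` file; the sketch below uses the standard library's `string.Template` as a stand-in so it runs without extra packages. The variable names `DB_USERNAME` and `DB_PASSWORD` and the host are hypothetical, since the contents of ``.env.example`` are not shown here.

```python
from string import Template

# Hypothetical variable names; the real ones are whatever .env.example defines.
env = {
    "DB_USERNAME": "alice",
    "DB_PASSWORD": "s3cret",
}

# Connection URL written with ${VARIABLE} placeholders, as it might appear in .env.
# The host is a placeholder, not a real cluster.
raw_url = "mongodb+srv://${DB_USERNAME}:${DB_PASSWORD}@cluster.example.net"

# substitute() replaces each ${NAME} with its value and raises KeyError
# if a placeholder has no matching entry.
connection_url = Template(raw_url).substitute(env)
print(connection_url)
```

Keeping the credentials in their own variables means the URL line never has to be edited when the username or password changes.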

## Run Scrapers locally
**Prerequisites**
- python3
- pipenv

**Installing dependencies**
Navigate to ``scrapers/`` and open the virtual environment using
```bash
pipenv shell
```
Then install dependencies using
```bash
pipenv install
```

**Scraper Usage**
To build a Docker image, use
```bash
pipenv run build
```
To run a Docker container named "smarecontainer", use
```bash
pipenv run cont
```
Then run a scraper inside the container:
```bash
# Scrape Craigslist homepage
pipenv run craigslist

# Scrape Facebook Marketplace homepage
pipenv run facebook
```
36 changes: 18 additions & 18 deletions github-metrics.svg
2 changes: 1 addition & 1 deletion metrics.plugin.screenshot.svg
File renamed without changes.
2 changes: 2 additions & 0 deletions scrapers/.flake8
@@ -0,0 +1,2 @@
[flake8]
max-line-length = 120
25 changes: 25 additions & 0 deletions scrapers/Dockerfile
@@ -0,0 +1,25 @@
FROM public.ecr.aws/lambda/python@sha256:f0c3116a56d167eba8021a5d7c595f969835fbe78826303326f80de00d044733 as build
RUN yum install -y unzip-* && \
curl -Lo "/tmp/chromedriver-linux64.zip" "https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/119.0.6045.105/linux64/chromedriver-linux64.zip" && \
curl -Lo "/tmp/chrome-linux64.zip" "https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/119.0.6045.105/linux64/chrome-linux64.zip" && \
unzip /tmp/chromedriver-linux64.zip -d /opt/ && \
unzip /tmp/chrome-linux64.zip -d /opt/ && \
yum clean all

FROM public.ecr.aws/lambda/python@sha256:f0c3116a56d167eba8021a5d7c595f969835fbe78826303326f80de00d044733
RUN yum install atk-* cups-libs-* gtk3-* libXcomposite-* alsa-lib-* \
libXcursor-* libXdamage-* libXext-* libXi-* libXrandr-* libXScrnSaver-* \
libXtst-* pango-* at-spi2-atk-* libXt-* xorg-x11-server-Xvfb-* \
xorg-x11-xauth-* dbus-glib-* dbus-glib-devel-* -y && \
yum clean all
COPY --from=build /opt/chrome-linux64 /opt/chrome
COPY --from=build /opt/chromedriver-linux64 /opt/

WORKDIR /var/task
COPY scrapers.py ./
COPY src ./src
COPY requirements.txt ./

RUN pip install --no-cache-dir -r requirements.txt

CMD [ "scrapers.craigslist" ]
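
On the AWS Lambda Python base images, `CMD` names the handler as `module.function`, and Lambda calls that function with an event and a context argument. The Pipfile scripts in this commit call `scrapers.craigslist("","")` with two empty strings, which is consistent with that signature. Since `scrapers.py` is not shown in this part of the diff, the following is only a hypothetical sketch of a compatible entry point; the function body and return shape are assumptions.

```python
# scrapers.py (hypothetical sketch; the real module is not shown in this diff)


def craigslist(event, context):
    """Entry point matching CMD [ "scrapers.craigslist" ].

    Lambda passes (event, context); a local call can pass empty strings.
    """
    # Placeholder for the real scraping logic, which this sketch omits.
    listings = []
    # Assumed return shape, for illustration only.
    return {"statusCode": 200, "scraped": len(listings)}


# Local invocation in the same style as the Pipfile scripts.
result = craigslist("", "")
print(result)
```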
26 changes: 26 additions & 0 deletions scrapers/Pipfile
@@ -0,0 +1,26 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[scripts]
build = "docker build --platform linux/amd64 -t smare ."
cont = "docker run --name smarecontainer -d smare:latest"
exec = "docker exec -it smarecontainer"
craigslist = "pipenv run exec python3 -c 'import scrapers; scrapers.craigslist(\"\",\"\")'"
facebook = "pipenv run exec python3 -c 'import scrapers; scrapers.facebook(\"\",\"\")'"

[packages]
selenium = "*"
bs4 = "*"
pymongo = "*"
typer = "*"
python-dotenv = "*"

[dev-packages]
isort = "*"
black = "*"
flake8 = "*"

[requires]
python_version = "3.11"
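
The `[scripts]` entries above are layered: `craigslist` calls `pipenv run exec`, and `exec` in turn wraps `docker exec -it smarecontainer`. The sketch below resolves that chain to the command that finally runs, assuming that `pipenv run <script> <args>` appends the extra arguments to the script's command. The `expand` helper is hypothetical and exists only for illustration.

```python
import shlex

# Script strings as they read once TOML has unescaped the \" sequences.
scripts = {
    "exec": "docker exec -it smarecontainer",
    "craigslist": "pipenv run exec python3 -c 'import scrapers; scrapers.craigslist(\"\",\"\")'",
}


def expand(name: str) -> list:
    """Resolve a script to its final argv, following nested `pipenv run` calls."""
    argv = shlex.split(scripts[name])
    # A script of the form `pipenv run <other> ...` delegates to <other>
    # and passes the remaining words along as extra arguments.
    if len(argv) > 2 and argv[:2] == ["pipenv", "run"] and argv[2] in scripts:
        return expand(argv[2]) + argv[3:]
    return argv


command = expand("craigslist")
print(shlex.join(command))
```

Under that assumption, the scraper runs inside the already-started container, so `pipenv run cont` has to come before `pipenv run craigslist`.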

1 comment on commit cf2acd7

@vercel vercel bot commented on cf2acd7 Dec 2, 2023

Successfully deployed to the following URLs:

seniordesign – ./

smare.vercel.app
seniordesign-git-main-lryanle.vercel.app
seniordesign-lryanle.vercel.app
smare.lryanle.com
