arthursimas1/wildfire-analysis

Wildfire Analysis

Brazilian wildfire analysis based on Apache Spark + Apache Kudu + Jupyter Notebook

Setup and Run

git clone --branch latest https://github.com/arthursimas1/wildfire-analysis.git
cd wildfire-analysis
docker-compose up -d

Open an impala-shell session in the impala container, connect, and create the three Kudu tables.

docker exec -it impala impala-shell
connect;   -- (run again if not successful)

CREATE TABLE queimada (
  datahora STRING,
  latitude DECIMAL(8, 5),
  longitude DECIMAL(8, 5),
  ano INT,
  mes INT,
  satelite STRING,
  pais STRING,
  estado STRING,
  municipio STRING,
  bioma STRING,
  diasemchuva DECIMAL(10, 5),
  precipitacao DECIMAL(10, 5),
  riscofogo DECIMAL(10, 5),
  frp DECIMAL(10, 5),
  PRIMARY KEY(datahora, latitude, longitude)
)
STORED AS KUDU;

CREATE TABLE queimadas_bioma_mes (
  bioma STRING,
  ano INT,
  mes INT,
  count BIGINT,
  PRIMARY KEY(bioma, ano, mes)
)
STORED AS KUDU;

CREATE TABLE frp_estado_mes (
  estado STRING,
  ano INT,
  mes INT,
  max_frp DECIMAL(10, 5),
  PRIMARY KEY(estado, ano, mes)
)
STORED AS KUDU;

exit;
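
The column types above constrain what can be written to queimada: the DECIMAL columns have a fixed precision and scale, and the three primary-key columns must be present. As a minimal sketch (not the notebook's actual loading code), the helper below coerces one raw text record to those types before it is written. It assumes the raw field names match the table's column names and that empty fields mean "no value"; `to_decimal` and `coerce_row` are hypothetical names introduced here for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# (precision, scale) of each DECIMAL column, copied from CREATE TABLE queimada.
DECIMAL_COLUMNS = {
    "latitude": (8, 5),
    "longitude": (8, 5),
    "diasemchuva": (10, 5),
    "precipitacao": (10, 5),
    "riscofogo": (10, 5),
    "frp": (10, 5),
}
INT_COLUMNS = ("ano", "mes")
PRIMARY_KEY = ("datahora", "latitude", "longitude")


def to_decimal(text, precision, scale):
    """Parse text as DECIMAL(precision, scale); missing or empty text becomes None."""
    if text is None or text.strip() == "":
        return None
    # Round to `scale` fractional digits, e.g. scale=5 -> step of 0.00001.
    value = Decimal(text.strip()).quantize(
        Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP
    )
    # Reject values with more integer digits than the column can hold.
    if abs(value) >= Decimal(10) ** (precision - scale):
        raise ValueError(f"{text!r} does not fit DECIMAL({precision}, {scale})")
    return value


def coerce_row(raw):
    """Return a copy of `raw` with numeric columns converted to schema types.

    Assumption: `raw` maps column names (same as the table's) to strings.
    """
    row = dict(raw)
    for column, (precision, scale) in DECIMAL_COLUMNS.items():
        row[column] = to_decimal(raw.get(column), precision, scale)
    for column in INT_COLUMNS:
        text = raw.get(column)
        row[column] = int(text) if text not in (None, "") else None
    # Primary-key columns cannot be null.
    missing = [c for c in PRIMARY_KEY if row.get(c) in (None, "")]
    if missing:
        raise ValueError(f"missing primary-key columns: {missing}")
    return row
```

Rounding with ROUND_HALF_UP is a choice made for this sketch; the rounding applied by the actual ingestion path may differ.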

💡 You can stop Impala now; it is only needed to create the tables.

docker stop impala

Get the JupyterLab link from the notebook container's logs.

docker logs pyspark-notebook
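
If the log output is long, the link can be picked out programmatically. The sketch below assumes the container prints a URL carrying a `?token=` query parameter on startup, as Jupyter servers commonly do; this README does not show the log format, so treat the pattern as an assumption. `find_jupyter_urls` is a hypothetical helper name.

```python
import re

# Assumption: the access link looks like http://host:port/path?token=<hex digits>.
TOKEN_URL = re.compile(r"https?://\S+\?token=[0-9a-f]+")


def find_jupyter_urls(log_text):
    """Return the distinct tokenized URLs found in `log_text`, in order of appearance."""
    urls = []
    for url in TOKEN_URL.findall(log_text):
        if url not in urls:
            urls.append(url)
    return urls
```

Here `log_text` would be the captured output of `docker logs pyspark-notebook`.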

Open JupyterLab, upload the notebook and data.zip, then run the notebook.
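
The two summary tables created earlier (queimadas_bioma_mes and frp_estado_mes) suggest per-month aggregates: a fire count per biome and a maximum FRP per state. The notebook computes its results with Spark; the plain-Python sketch below only illustrates the shape of those aggregates on in-memory rows, under the assumption that `count` is a row count and `max_frp` ignores rows with no FRP value. Both function names are hypothetical.

```python
from collections import Counter


def count_by_bioma_month(rows):
    """Count rows per (bioma, ano, mes), shaped like queimadas_bioma_mes."""
    counts = Counter((r["bioma"], r["ano"], r["mes"]) for r in rows)
    return [
        {"bioma": bioma, "ano": ano, "mes": mes, "count": n}
        for (bioma, ano, mes), n in sorted(counts.items())
    ]


def max_frp_by_estado_month(rows):
    """Maximum frp per (estado, ano, mes), shaped like frp_estado_mes.

    Assumption: rows whose frp is None are skipped.
    """
    best = {}
    for r in rows:
        if r["frp"] is None:
            continue
        key = (r["estado"], r["ano"], r["mes"])
        if key not in best or r["frp"] > best[key]:
            best[key] = r["frp"]
    return [
        {"estado": estado, "ano": ano, "mes": mes, "max_frp": frp}
        for (estado, ano, mes), frp in sorted(best.items())
    ]
```

Each output key matches the primary key of the corresponding table, so every (bioma, ano, mes) or (estado, ano, mes) combination appears at most once.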

Then you can stop the running containers:

docker-compose stop

Or, once they are stopped, remove them:

docker-compose rm
