Read pcap and har files and assemble HTTP requests


rambler-digital-solutions/pcaper

 
 

pcaper


The package helps to assemble and iterate over HTTP requests. pcaper provides classes to read traffic files in pcap or har format, plus two executable converters, pcap2txt and har2txt. PcapParser is based on dpkt. HarParser uses the built-in json package.

pcaper extends the dpkt.http.Request class. The following fields of an HTTP request are available:

  • timestamp - timestamp of the last packet of the original HTTP request
  • src - source IP address
  • dst - destination IP address
  • sport - source TCP port
  • dport - destination TCP port
  • method - HTTP request method
  • version - HTTP protocol version
  • uri - HTTP request URI
  • headers - ordered dictionary of HTTP headers
  • origin_headers - ordered dictionary of HTTP headers with case-sensitive names
  • body - HTTP request body
  • origin - original HTTP request
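
The fields listed above can be combined into a one-line summary per request. The sketch below is illustrative only: `summarize` and `fake_request` are hypothetical names, the stand-in object merely mimics the attribute names from the list, and it assumes that `timestamp` is a float and that `src` and `dst` are dotted strings, which is not confirmed here.

```python
from collections import OrderedDict
from types import SimpleNamespace


def summarize(request):
    """Return a one-line summary built from the documented request fields."""
    return "{ts:.3f} {src}:{sport} -> {dst}:{dport} {method} {uri} HTTP/{version}".format(
        ts=request.timestamp,
        src=request.src,
        sport=request.sport,
        dst=request.dst,
        dport=request.dport,
        method=request.method,
        uri=request.uri,
        version=request.version,
    )


# Hypothetical stand-in for a parsed request (assumed field types).
fake_request = SimpleNamespace(
    timestamp=1500000000.25,
    src="10.4.0.136",
    dst="10.1.40.61",
    sport=51234,
    dport=8888,
    method="GET",
    uri="/index.html",
    version="1.1",
    headers=OrderedDict([("host", "example.com")]),
    body=b"",
)

print(summarize(fake_request))
```

With pcaper installed, the same function should accept the objects yielded by the parsers, provided the field types match the assumptions above.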

Installation

pip install pcaper

Import

import pcaper
pcap_parser = pcaper.PcapParser()
har_parser = pcaper.HarParser()

Examples

Iterate HTTP requests

Read pcap file, assemble and iterate HTTP requests

from pcaper import PcapParser

pcap_parser = PcapParser()
params = {
    'input': 'file.pcap',
}
for request in pcap_parser.read_pcap(params):
    print(request.origin)

Read har file, assemble and iterate HTTP requests

from pcaper import HarParser

har_parser = HarParser()
params = {
    'input': 'file.har'
}
for request in har_parser.read_har(params):
    print(request.origin)
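
Because both parsers are consumed with a plain `for` loop, any function that accepts an iterable of requests can process their output. As a small sketch, `count_methods` (a hypothetical helper, not part of pcaper) tallies requests per HTTP method; the sample objects are stand-ins that only carry the `method` field.

```python
from collections import Counter
from types import SimpleNamespace


def count_methods(requests):
    """Count requests per HTTP method for any iterable of request objects."""
    return Counter(request.method for request in requests)


# Hypothetical stand-ins; with pcaper installed, pass
# PcapParser().read_pcap({'input': 'file.pcap'}) instead.
sample = [
    SimpleNamespace(method="GET"),
    SimpleNamespace(method="POST"),
    SimpleNamespace(method="GET"),
]

print(count_methods(sample).most_common())
```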

Extract separate HTTP request headers

You can extract a header by name

import pcaper

reader = pcaper.PcapParser()
params = {
    'input': 'file.pcap'
}
for request in reader.read_pcap(params):
    print(request.headers['host'])
    print(request.headers['user-agent'])
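
Indexing a dictionary with a header that is absent from the request raises `KeyError`, so a lookup with a default can be more convenient. The sketch below assumes that `headers` keys are lowercase (as the examples above suggest) and uses a plain `OrderedDict` as a stand-in; `get_header` is a hypothetical helper.

```python
from collections import OrderedDict


def get_header(headers, name, default=None):
    """Look up a header by name, lowercasing the name first."""
    return headers.get(name.lower(), default)


# Hypothetical data shaped like request.headers (lowercase keys assumed).
headers = OrderedDict([("host", "example.com"), ("user-agent", "curl/7.58.0")])

print(get_header(headers, "Host"))
print(get_header(headers, "X-Missing", "-"))
```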

Filter TCP/IP packets

You can filter out unneeded packets

import pcaper

reader = pcaper.PcapParser()
params = {
    'input': 'file.pcap',
    'filter': 'ip.dst == 1.1.1.1'
}
for request in reader.read_pcap(params):
    print(request.origin)

You can combine TCP and IP filters in dpkt style

import pcaper

reader = pcaper.PcapParser()
params = {
    'input': 'file.pcap',
    'filter': '(ip.src == 10.4.0.136 or ip.dst == 10.1.40.61) and tcp.dport == 8888'
}
for request in reader.read_pcap(params):
    print(request.origin)

Excluding filters in dpkt style are possible too

import pcaper

reader = pcaper.PcapParser()
params = {
    'input': 'file.pcap',
    'filter': 'tcp.dport != 8888 and ip.dst != 10.1.40.61'
}
for request in reader.read_pcap(params):
    print(request.origin)
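
When the addresses and ports come from configuration, the filter string can be assembled programmatically. This is a sketch of plain string building in the syntax shown above; `any_of` and `build_filter` are hypothetical helpers, not part of pcaper.

```python
def any_of(field, values):
    """Join 'field == value' terms with 'or', parenthesized when needed."""
    terms = ["{} == {}".format(field, value) for value in values]
    return terms[0] if len(terms) == 1 else "({})".format(" or ".join(terms))


def build_filter(dst_ips=(), dports=(), excluded_dports=()):
    """Compose a filter expression from destination IPs and ports."""
    clauses = []
    if dst_ips:
        clauses.append(any_of("ip.dst", dst_ips))
    if dports:
        clauses.append(any_of("tcp.dport", dports))
    clauses.extend("tcp.dport != {}".format(port) for port in excluded_dports)
    return " and ".join(clauses)


params = {
    "input": "file.pcap",
    "filter": build_filter(dst_ips=["10.1.40.61", "10.1.40.62"], dports=[8888]),
}
print(params["filter"])
```

The resulting `params` dictionary has the same shape as the ones passed to `read_pcap` in the examples above.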

Note

The newer pcapng format is not supported by the dpkt package, but you can convert the input file from pcapng to pcap with mergecap, a standard utility installed with the Wireshark package.

mergecap file.pcapng -w out.pcap -F pcap
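
To decide whether a file needs conversion, the first four bytes can be checked: a pcapng file starts with the Section Header Block type 0x0A0D0D0A, while a classic pcap file starts with one of the pcap magic numbers. The sketch below is not part of pcaper; `detect_capture_format` and `mergecap_command` are hypothetical helpers, and the demo only builds the command line shown above without running it.

```python
import os
import tempfile

PCAPNG_MAGIC = b"\x0a\x0d\x0d\x0a"  # pcapng Section Header Block type
PCAP_MAGICS = {
    b"\xa1\xb2\xc3\xd4", b"\xd4\xc3\xb2\xa1",  # microsecond timestamps
    b"\xa1\xb2\x3c\x4d", b"\x4d\x3c\xb2\xa1",  # nanosecond timestamps
}


def detect_capture_format(path):
    """Return 'pcapng', 'pcap' or 'unknown' based on the first four bytes."""
    with open(path, "rb") as handle:
        magic = handle.read(4)
    if magic == PCAPNG_MAGIC:
        return "pcapng"
    if magic in PCAP_MAGICS:
        return "pcap"
    return "unknown"


def mergecap_command(src, dst):
    """Build the mergecap conversion command as an argument list."""
    return ["mergecap", src, "-w", dst, "-F", "pcap"]


# Demo on a throwaway file that contains only the pcapng magic bytes.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "file.pcapng")
    with open(sample, "wb") as handle:
        handle.write(PCAPNG_MAGIC)
    if detect_capture_format(sample) == "pcapng":
        # Pass this list to subprocess.run(..., check=True) to convert.
        print(" ".join(mergecap_command("file.pcapng", "out.pcap")))
```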

Scripts

pcap2txt

The pcap2txt script is installed into the Python scripts directory and can be run directly from the command line.

It simplifies parsing of pcap files: it extracts HTTP requests, including their headers and bodies, and prints the complete data to the console or to a file.

Print HTTP requests from pcap file:

pcap2txt file.pcap

Filter TCP/IP packets, extract HTTP requests and write to external file:

pcap2txt -f "tcp.dport == 8080 and ip.dst != 10.10.10.10" -o file.out file.pcap

Filter HTTP packets

pcap2txt -F '"rambler.ru" in http.uri' file.pcap

You can use logical expressions in filters

pcap2txt -F '"keep-alive" in http.headers["connection"] or "Keep-alive" in http.headers["connection"]' file.pcap

Standard Python string methods can be applied to HTTP request headers

pcap2txt -F '"keep-alive" in http.headers["connection"].lower()' file.pcap
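
The `-F` expressions above read as ordinary Python boolean expressions over an `http` object. How pcap2txt evaluates them, and how it treats a request without the header, is not described here. The sketch below only reproduces the logic of the last expression in plain Python, with a guard for a missing header; `keeps_alive` and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def keeps_alive(http):
    """Case-insensitive check for 'keep-alive' in the Connection header."""
    return "keep-alive" in http.headers.get("connection", "").lower()


# Hypothetical stand-ins for the `http` object used in -F expressions.
mixed_case = SimpleNamespace(headers={"connection": "Keep-Alive"})
closing = SimpleNamespace(headers={"connection": "close"})
no_header = SimpleNamespace(headers={})

print(keeps_alive(mixed_case), keeps_alive(closing), keeps_alive(no_header))
```

Lowercasing the header value once covers every spelling variant, which replaces the two-clause `or` expression shown earlier.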

Excluding filters work as well

pcap2txt -F '"rambler.ru" not in http.uri' file.pcap

Print statistics about counted requests:

pcap2txt -f "ip.src == 10.10.10.10" -S file.pcap

Stats:
    total: 1
    complete: 1
    incorrect: 0
    incomplete: 0
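
If the statistics are needed in another script, the block printed by `-S` can be parsed into a dictionary. This is a minimal sketch that assumes the output layout shown above (a `Stats:` line followed by `name: number` lines); `parse_stats` is a hypothetical helper.

```python
def parse_stats(text):
    """Parse the 'Stats:' block into a dict of integer counters."""
    stats = {}
    in_stats = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Stats:":
            in_stats = True
            continue
        if in_stats and ":" in stripped:
            key, _, value = stripped.partition(":")
            if value.strip().isdigit():
                stats[key.strip()] = int(value)
    return stats


# Sample text copied from the output above.
output = """Stats:
    total: 1
    complete: 1
    incorrect: 0
    incomplete: 0
"""
print(parse_stats(output))
```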

har2txt

The har2txt script is installed into the Python scripts directory and can be run directly from the command line.

It simplifies parsing of har files: it extracts HTTP requests, including their headers and bodies, and prints the complete data to the console or to a file.

Print HTTP requests from har file:

har2txt file.har

Filter HTTP packets

har2txt -F 'http.version == "1.1"' file.har

Excluding filters work as well

har2txt -F '"rambler.ru" not in http.uri' file.har

Filter packets by destination IP. The data pcaper extracts from a har file contains the destination IP (dst field), but not the source IP or the source and destination ports.

har2txt -F 'http.dst == "1.1.1.1"' file.har
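
For context, a har file is a JSON document, and in the HAR 1.2 format each entry may carry the server address in the optional `serverIPAddress` field. The sketch below reads such a document with the built-in json package. It is not pcaper's implementation, and whether pcaper fills `dst` from exactly this field is an assumption; `iter_har_requests` and `SAMPLE_HAR` are hypothetical names.

```python
import json


def iter_har_requests(har_text):
    """Yield (method, url, http_version, server_ip) for each HAR entry."""
    log = json.loads(har_text)["log"]
    for entry in log.get("entries", []):
        request = entry["request"]
        yield (
            request["method"],
            request["url"],
            request["httpVersion"],
            entry.get("serverIPAddress"),  # optional in HAR 1.2
        )


# Minimal hand-written HAR document used only for this demo.
SAMPLE_HAR = json.dumps({
    "log": {
        "version": "1.2",
        "entries": [
            {
                "serverIPAddress": "1.1.1.1",
                "request": {"method": "GET", "url": "http://example.com/",
                            "httpVersion": "HTTP/1.1"},
            },
            {
                "request": {"method": "POST", "url": "http://example.com/form",
                            "httpVersion": "HTTP/1.1"},
            },
        ],
    }
})

for item in iter_har_requests(SAMPLE_HAR):
    print(item)
```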

Print statistics about counted requests:

har2txt -S -F 'http.dst == "10.10.10.10"' file.har

Stats:
    total: 1
    complete: 1
    incorrect: 0
    incomplete: 0
