Add files via upload
pdwaggoner authored Sep 30, 2018
1 parent 03ff4c6 commit 16c3642
Showing 7 changed files with 128 additions and 9 deletions.
21 changes: 21 additions & 0 deletions DESCRIPTION
@@ -0,0 +1,21 @@
Package: rIP
Type: Package
Title: Passes an array of IP addresses to iphub.info and returns a dataframe with details of IP
Version: 0.1.0
Authors@R: c(person("Ryan", "Kennedy", role = c("aut", "cre"), email = "[email protected]"),
person("Philip", "Waggoner", role = "aut", email = "[email protected]"),
person("Scott", "Clifford", role = "ctb"))
Maintainer: Ryan Kennedy <[email protected]>
Description: Takes as its input an array of IPs and the user's X-Key, passes these to iphub.info, and returns a dataframe with the ip (used for merging), country code, country name, asn, isp, block, and hostname.
Especially important in this is the variable "block", which gives a score indicating whether the IP address is likely from a server farm and should be excluded from the data. It is coded 0 if the IP is residential/unclassified (i.e. safe IP), 1 if the IP is non-residential (hosting provider, proxy, etc. -- should likely be excluded), and 2 for non-residential and residential IPs (more stringent, may flag innocent respondents).
The recommendation from iphub.info is to block or exclude those who score block = 1.
Imports: httr, utils
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.0.1
NeedsCompilation: no
Packaged: 2018-09-28 21:00:38 UTC
Author: Ryan Kennedy [aut, cre],
Philip Waggoner [aut],
Scott Clifford [ctb]
2 changes: 2 additions & 0 deletions LICENSE
@@ -0,0 +1,2 @@
YEAR: 2018
COPYRIGHT HOLDER: Ryan Kennedy
1 change: 1 addition & 0 deletions NAMESPACE
@@ -0,0 +1 @@
exportPattern("^[[:alpha:]]+")
42 changes: 42 additions & 0 deletions R/rIP.R
@@ -0,0 +1,42 @@
#' Passes an array of IP addresses to iphub.info and returns a dataframe with details of IP
#'
#' Makes a call to an IP address verification service (iphub.info) that returns the information on the IP address, including the internet service provider (ISP) and whether it is likely a server farm being used to disguise a respondent's location.
#'@usage getIPinfo(d, "i", "key")
#'@param d Data frame where IP addresses are stored
#'@param i Name of the column in data frame d that holds the IP addresses, given in quotation marks
#'@param key User's X-key in quotation marks
#'@details Takes an array of IPs and the user's X-Key, and passes these to iphub.info. Returns a dataframe with the IP address (used for merging), country code, country name, asn, isp, block, and hostname.
#'@return ipDF A dataframe with the IP address, country code, country name, asn, isp, block, and hostname.
#'@note Users must have an active iphub.info account with a valid X-key.
#'@examples
#'id <- c(1,2,3,4) # fake respondent id's
#'ips <- c("192.0.2.1", "198.51.100.2", "203.0.113.3", "192.0.2.4") # fake ips (reserved documentation addresses)
#'data <- data.frame(id,ips)
#'\dontrun{getIPinfo(data, "ips", "YOUR_X_KEY")} # needs network access and a valid X-key
#'@export
getIPinfo <- function(d, i, key) {
  if (!requireNamespace("httr", quietly = TRUE)) {
    stop("Package \"httr\" needed for this function to work. Please install it.",
         call. = FALSE)
  }
  if (!requireNamespace("utils", quietly = TRUE)) {
    stop("Package \"utils\" needed for this function to work. Please install it.",
         call. = FALSE)
  }
  ips <- unique(d[, i])  # query each distinct IP only once
  url <- "http://v2.api.iphub.info/ip/"
  pb <- utils::txtProgressBar(min = 0, max = length(ips), style = 3)
  ipDF <- c()
  for (j in seq_along(ips)) {  # `j`, so the column-name argument `i` is not overwritten
    ipInfo <- httr::GET(paste0(url, ips[j]), httr::add_headers(`X-Key` = key))
    infoVector <- unlist(httr::content(ipInfo))
    ipDF <- rbind(ipDF, infoVector)
    utils::setTxtProgressBar(pb, j)
  }
  close(pb)
  # Keep strings as character without changing the global stringsAsFactors option
  ipDF <- data.frame(ipDF, stringsAsFactors = FALSE)
  rownames(ipDF) <- NULL

  return(ipDF)
}
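
The loop above binds whatever `unlist()` returns for each response, so a reply with missing or extra fields (for example an error message) could misalign the rows. A minimal sketch of one way to guard against that follows. The helper name `parse_ip_info` is hypothetical, and the field names in `fields` are an assumption about the iphub.info response that this commit does not confirm.

```r
# Assumed field names of the iphub.info response; adjust to the actual API.
fields <- c("ip", "countryCode", "countryName", "asn", "isp", "block", "hostname")

# Hypothetical helper: turn one parsed response (a named list) into a
# one-row data frame with a fixed set of columns, filling gaps with NA.
parse_ip_info <- function(info) {
  values <- lapply(fields, function(f) {
    v <- info[[f]]
    if (is.null(v)) NA_character_ else as.character(v)
  })
  names(values) <- fields
  as.data.frame(values, stringsAsFactors = FALSE)
}

# A complete-looking response and an error-style response (both made up)
ok  <- parse_ip_info(list(ip = "192.0.2.1", countryCode = "ZZ", block = 1))
bad <- parse_ip_info(list(error = "Invalid key"))
rows <- rbind(ok, bad)  # columns line up because both rows share `fields`
```

Every value is stored as character here, which mirrors what `unlist()` followed by `rbind()` produces when a response mixes strings and numbers.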
18 changes: 9 additions & 9 deletions README.md
@@ -1,9 +1,9 @@
## `rIP` An `R` package to detect responses from server farms on MTurk surveys

Takes an array of IPs and the user's X-Key, and passes these to `iphub.info`. Returns a dataframe with the IP address (used for merging), country code, country name, asn, isp, block, and hostname.

Especially important is the variable `block`, which gives a score indicating whether the IP address is likely from a server farm and should be excluded from the data. It is coded 0 if the IP is residential/unclassified (i.e. safe IP), 1 if the IP is non-residential (hosting provider, proxy, etc. -- should likely be excluded), and 2 for non-residential and residential IPs (more stringent, may flag innocent respondents).

The recommendation from `iphub.info` is to block or exclude those who score `block = 1`.

Credit to @tylerburleigh for pointing out the utility of `iphub.info`. His method for incorporating this information into Qualtrics surveys can be found [here](https://twitter.com/tylerburleigh/status/1042528912511848448?s=19).
`rIP` detects likely responses from server farms on MTurk surveys.

Takes as its input an array of IPs and the user's X-Key, passes these to iphub.info, and returns a dataframe with the ip (used for merging), country code, country name, asn, isp, block, and hostname.

Especially important in this is the variable "block", which gives a score indicating whether the IP address is likely from a server farm and should be excluded from the data. It is coded 0 if the IP is residential/unclassified (i.e. safe IP), 1 if the IP is non-residential (hosting provider, proxy, etc. - should likely be excluded), and 2 for non-residential and residential IPs (more stringent, may flag innocent respondents).

The recommendation from iphub.info is to block or exclude those who score block = 1.
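
As a rough sketch of how that recommendation might be applied, the snippet below merges a lookup table back onto survey data and drops rows with `block` equal to 1. The `ipDF` here is a hand-built stand-in for what `getIPinfo()` returns, so no API call is made. The column names `ip` and `block`, and `block` being stored as character, are assumptions about that output.

```r
# Made-up survey data using reserved documentation IP addresses
survey <- data.frame(id = 1:4,
                     ips = c("192.0.2.1", "198.51.100.2", "203.0.113.3", "192.0.2.4"),
                     stringsAsFactors = FALSE)

# Stand-in for the getIPinfo() result (assumed columns: ip, block)
ipDF <- data.frame(ip = survey$ips,
                   block = c("0", "1", "2", "0"),
                   stringsAsFactors = FALSE)

# Attach the lookup to each respondent, keeping respondents with no match
merged <- merge(survey, ipDF, by.x = "ips", by.y = "ip", all.x = TRUE)

# Flag block == 1 only; rows with a missing score are kept for manual review
flagged <- !is.na(merged$block) & merged$block == "1"
kept <- merged[!flagged, ]
```

Whether to also exclude `block == 2` is a judgment call, since the text above notes that this stricter category may flag innocent respondents.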

We thank @tylerburleigh for pointing out the utility of iphub.info. His method for incorporating this information into Qualtrics surveys can be found [here](https://twitter.com/tylerburleigh/status/1042528912511848448?s=19).
33 changes: 33 additions & 0 deletions man/getIPinfo.Rd

20 changes: 20 additions & 0 deletions rIP.Rproj
@@ -0,0 +1,20 @@
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: pdfLaTeX

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes

BuildType: Package
PackageUseDevtools: Yes
PackageInstallArgs: --no-multiarch --with-keep.source
