Detect and remove bad image URLs for person avatars #218
Comments
(on the above link, click through to the state electeds to see a broken image)
We actually have an image URL for that person, but it's bad: http://www.house.state.oh.us/houseImages/129/headshots/h32.jpg It redirects and then gives a 404. When we don't have any image URL, it does give the default blank avatar. I'll have a think about a strategy for testing and pruning out image URLs so this is avoided.
This is still important, b/c electeds will want to see their faces where possible, and it seems a fair number of photos are missing from the API. E.g. an ally, Brad Lander :: https://www.dropbox.com/s/y1lpqcnk5pxg9zi/Screenshot%202014-01-22%2014.00.38.png
Just because it is important doesn't mean it is possible. Looking at the example of Brad Lander… At this point most councilmember data (i.e. outside of Philly, San José, and Chicago) is acquired from Google Civic Info API. If the API has an image for the person, we grab it. Google has recently updated the data for NYC and now has a photo url for Brad Lander. Because Brad Lander hasn't been loaded into the production site's data yet, when someone does go to ask him a question, we should get his photo url. I.e. it wasn't available when we loaded his data on preview, but is now.

However, when we look at another councilmember, Antonio Reynoso, Google Civic Info API does not have an image url for him and we are shit out of luck. If Google Civic Info API later adds a photo_url for Antonio Reynoso, the next time someone has him as a potential recipient for a new question with an address lookup, we'll grab his photo_url and save it.

Having said all that, the problem of not having images available for some elected officials is not actually what this issue, #218, is about! It's about when we DO have a photo url for a person, but it is a BAD URL. That means what we want to do is clear out photo urls anywhere in our elected officials data where there IS a photo url for a person, but it no longer returns an actual image from the web.

What this is going to require is running something through almost all our data that requests almost EVERY person's photo url and checks to see if it is still any good. If not, it should remove the photo url value from the person. Having no photo_url value will then allow our subsequent calls to Google Civic Info API to populate the person's photo_url with one that is hopefully more current and works.
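The sweep described above could be sketched roughly as follows. This is only an illustration under assumptions, not code from the AskThem repository: `image_response?`, `photo_url_ok?`, `prune_bad_photo_urls` and `DEFAULT_FETCH` are hypothetical names, and a URL counts as "good" only if it ends in a 200 response with an `image/*` content type after following a few redirects (which would catch the Ohio example that redirects and then gives a 404).

```ruby
require "net/http"
require "uri"

# Hypothetical helper: does this status/content type look like a usable image?
def image_response?(code, content_type)
  code.to_i == 200 && content_type.to_s.downcase.start_with?("image/")
end

# Default fetcher: issues a HEAD request and returns [code, content_type, location].
# Assumption: the image host answers HEAD requests; some servers do not.
DEFAULT_FETCH = lambda do |uri|
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: 5, read_timeout: 5) do |http|
    http.request_head(uri.request_uri)
  end
  [response.code, response["content-type"], response["location"]]
end

# Follows up to max_redirects redirects and checks the final response.
# Any error (bad URI, timeout, DNS failure) is treated as a bad URL here.
def photo_url_ok?(url, fetch: DEFAULT_FETCH, max_redirects: 3)
  uri = URI.parse(url)
  (max_redirects + 1).times do
    return false unless uri.is_a?(URI::HTTP)
    code, content_type, location = fetch.call(uri)
    redirect = code.to_s.start_with?("3") && location
    return image_response?(code, content_type) unless redirect
    uri = uri.merge(location) # resolves relative Location headers
  end
  false # too many redirects
rescue StandardError
  false
end

# Clears photo_url on every person whose URL fails the check and returns
# the list of cleared URLs. With real records, a save! would follow the
# assignment, as in the rails console command used for Ballotpedia.
def prune_bad_photo_urls(people, checker: method(:photo_url_ok?))
  people.each_with_object([]) do |person, cleared|
    url = person.photo_url
    next if url.nil? || url.empty?
    next if checker.call(url)
    cleared << url
    person.photo_url = nil
  end
end
```

Treating timeouts as failures is a design choice that may be too aggressive for flaky government servers; a gentler variant might only clear a URL after it fails on several separate runs.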
As an interim measure, I have gone ahead and cleared out the 148 photo urls from state legislators that pointed at bad Ballotpedia urls with this command in the rails console for both preview (aka staging) and production:

    Person.where(photo_url: /ballotpedia/i).each { |person| person.photo_url = nil; person.save! }

If Google Civic Info API has photo urls for these state legislators and anyone enters an address that brings them up through Google Civic Info API, we should then get the good photo url for them.
Could you cache photos when you first find that they're valid, and have Cloudfront sit in front of the cached copies? Then your problem at least becomes stale photos rather than missing photos - which you could mitigate by re-checking your photos on a regular basis and only updating photos when you have a heuristic that tells you it's actually a |
We use Cloudfront now in front of images in most cases for elected officials, but I have to admit I haven't checked whether it is used as well as it could be. I've only tweaked what was there before I came on the scene as needed rather than figuring out if we are doing the best thing possible. When I get a breather I'll take a look at the assets as a whole and see if they can be handled in a less fragile way. Images going missing is common for opengov data, so we have to plan for it.
All understood. I haven't used Cloudfront myself either. I just looked at the URLs, and they look like this:
So CloudFront is sitting right in front of the original URL, which can change underneath. My suggestion is to cache them at an askthem.io URL, and then continue to use CloudFront but have it sit in front of that URL. Just so the problem is obvious, here's a cropped screen of the state of Indiana's state people:
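One way to picture the "cache at an askthem.io URL" suggestion is a stable storage path per photo that no longer depends on the upstream host staying alive. The sketch below is only an assumption about how such a path might be derived; `cached_photo_path`, the `photos/` prefix, the person id format and the fallback extension are all hypothetical.

```ruby
require "digest"
require "uri"

# Hypothetical: builds a storage path for a locally cached copy of a photo.
# The short digest of the source URL changes the path whenever the upstream
# URL changes, so a CDN in front of the cache would not serve the old copy.
def cached_photo_path(person_id, source_url)
  ext = File.extname(URI.parse(source_url).path).downcase
  # Assumption: fall back to .jpg when the URL has no recognisable extension.
  ext = ".jpg" unless %w[.jpg .jpeg .png .gif].include?(ext)
  digest = Digest::SHA256.hexdigest(source_url)[0, 12]
  "photos/#{person_id}-#{digest}#{ext}"
end
```

The file would only be written after the source URL has been confirmed to return an image, so a later upstream 404 leaves the cached copy in place.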
Either way we are going to have to work through our data and clear out the bad URLs as we don't have caches for them. Once that is done, the placeholder image will be used instead of the broken icon. I would actually like to build in some reporting of bad images into the periodically run sweeper or whatever so we can alert those in charge of the source data. |
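For the reporting idea, a sweeper could group the bad URLs it finds by host, so that whoever maintains each source gets one summary rather than a message per broken image. A minimal sketch, assuming the sweeper already collects the failing URLs into an array; `bad_photo_report` and `host_of` are hypothetical names.

```ruby
require "uri"

# Hypothetical: extracts the host from a URL, or a placeholder if unparseable.
def host_of(url)
  URI.parse(url).host || "(unknown)"
rescue URI::InvalidURIError
  "(unknown)"
end

# Summarises bad photo URLs per host, worst offenders first,
# keeping up to three example URLs for each host.
def bad_photo_report(bad_urls)
  bad_urls.group_by { |url| host_of(url) }
          .map { |host, urls| { host: host, count: urls.size, examples: urls.first(3) } }
          .sort_by { |row| [-row[:count], row[:host]] }
end
```

Each row could then be mailed or filed as an issue with the owner of that host's data, which is outside the scope of this sketch.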
Original issue name: default avatars when no photo (or something like that)
On many pages ::
http://preview.askthem.io/locator?q=cincinnati%2C+oh#
... images are showing as broken when we don't have the avatar from our various data sources. Can we have them be uniformly the blank-person-outline default image, when not definitely available? Realize easier typed than done, haha.