top of page
Writer's pictureH-Barrio

Visitor ID & Location for Your Pythonanywhere web app

If you have a web app deployed at www.pythonanywhere.com, you may need a simple visitor analysis tool. This post will show how to build a tool for such a task in the form of a jupyter notebook. We will need the following modules:

import os
from os.path import isfile, join
import gzip
import geocoder
from ipwhois import IPWhois
from collections import Counter

Operating system tools os to locate our files, os.path tools to check for the existence of the correct log files that we need to analyze, gzip to unzip those files, and geocoder and ipwhois to identify the location and identity of a visiting IP at our web app. Counter to count visits. First of all, we need to locate our server's log files, always in /var/log for the pythonanywhere server, if they are files and correspond to access log files if they contain the ".access." string, you, dear reader, could chain the search in a single list comprehension if you favor acrobatic Python:

LOG_DIR = '/var/log'
onlyfiles = [f for f in os.listdir(LOG_DIR) if isfile(join(LOG_DIR, f))]
access_logs = [os.path.join(LOG_DIR, f) for f in onlyfiles if '.access.' in f]

These access logs should be a mix of zipped, older files and unzipped newer access files. We will open and read the files depending on their nature and grab the first element of each line; this contains visiting IP as bytes object that we can decode into a string:

visiting_ips = []

for f in access_logs:
    if ".gz" in f:
        o = gzip.open(f, 'r')
    else:
        o = open(f, 'r')
        
    with o as file:
        print(f)
        for line in file.readlines():
            ip = line.split()[0]
            print(f"IP: {ip}")
            if isinstance(ip, bytes): ip = ip.decode()
            visiting_ips.append(ip)

Our unique visitors are in the set of the visiting ips list:

unique_visitors = set(visiting_ips)

We can traverse this list to try and grab the whois information for the IP using IPWhois and adding the results into a dictionary for future access:

counts = Counter(visiting_ips)
visitors = {}

for ip in unique_visitors:    
    try:
        obj = IPWhois(ip)
        results = obj.lookup_rdap(depth=1)
    except:
        print('--------------------------------------')
        print('IPWhois failed, continue to next IP...')
        continue
    
    print('--------------------')
    print(f'IP: {ip}')
    print(f'Visits: {counts[ip]}')
    geoip = geocoder.ip(ip)    
    print(geoip.city)
    visitors[ip] = [counts[ip], geoip.city, results['asn_description']]
    
    
    print(results['asn_description'])

Optionally, we can now store our "visitors" dictionary as a JSON file; we will need the extra json module import:

import json

with open("site_visitors.json", "w") as f:
    json.dump(visitors, f, indent=4)

The JSON file will be located in the same filter that holds your jupyter notebook inside your pythonanywhere file system; this can be found using:

os.getcwd()

In a follow-up post, we will display these visitors on a map for quick visualization. The notebook for this post is inside Google Colab in this link. It will not work when run on Colab; it needs to be executed from within your pythonanywhere file system.



Do not hesitate to contact us if you require quantitative model development, deployment, verification, or validation. We will also be glad to help you with your machine learning or artificial intelligence challenges when applied to asset management, automation, or intelligence gathering from satellite, drone, or fixed-point imagery. Also, check our AI-Powered Spanish public tender search application using sentence similarity analysis to provide better tender matches to selling companies.

65 views0 comments

Recent Posts

See All

Comments


bottom of page