Is IP address retrieval authorized in terms of users' privacy? - web-services

I'm new to developing large-scale web services, and I'd like to retrieve visitors' IP addresses to compile statistics on their country/state of origin.
Is it allowed to take IP addresses from clients for internal use?
Since an IP address is a kind of personal information, I wonder whether retrieving it is legal.

It's not possible for you not to know the client IP (because your site couldn't work without it), but you don't have to keep it. From a GDPR perspective, data is only "personal data" if it can be linked to an individual (even indirectly), so for example you could take the client IP, do some kind of GeoIP lookup on it (preferably local), and then increment a country counter. Then you can simply discard the IP, and the aggregate data you retain has no way of being connected back to an individual, so it's not personal data.
A very simple approach would be a table like this:
Country   Count
France    2
Germany   4
USA       10
So you would just bump the count for the country each time. This gives you the data you're after, but without any privacy impact for your users, and no GDPR exposure.
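The counting approach described above could be sketched roughly as follows. This is only an illustration: `lookup_country`, `record_visit`, and the tiny `_GEO_TABLE` (built on documentation-only IP ranges) are hypothetical stand-ins for a real local GeoIP database, not part of any particular library.

```python
from collections import Counter
import ipaddress

# Hypothetical stand-in for a local GeoIP database: network -> country.
# A real deployment would query an actual GeoIP dataset instead.
_GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "France",
    ipaddress.ip_network("198.51.100.0/24"): "Germany",
    ipaddress.ip_network("203.0.113.0/24"): "USA",
}

# The only data that is retained: one aggregate counter per country.
country_counts = Counter()


def lookup_country(ip: str) -> str:
    """Resolve an IP address to a country using the local table."""
    addr = ipaddress.ip_address(ip)
    for network, country in _GEO_TABLE.items():
        if addr in network:
            return country
    return "Unknown"


def record_visit(ip: str) -> None:
    """Bump the country counter; the IP itself is never stored."""
    country_counts[lookup_country(ip)] += 1


record_visit("192.0.2.10")
record_visit("198.51.100.7")
record_visit("198.51.100.8")
```

The IP only exists as a local variable for the duration of the call; what persists is the aggregate count.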

Related

How to keep track of visitors/auth users/ip address activities

We need to start keeping track of user activities across the website. This means a user with inappropriate behavior should be marked somehow. First of all, how can I actually detect whether an IP address is accessing the website from different browsers? Can this be done with cookies? Can this be done by keeping track of each IP address's user agent activity? Something like keeping the data in a file or database table, analyzing it every 2-3 days, and deciding which accounts to delete.

presence of bitcoin addresses on blockchain

I recently read somewhere that a speaker mentioned Bitcoin doesn't store addresses in the blockchain. If addresses do not exist on the blockchain, how can you check their balances on, for example, blockchain.com?
Payment addresses are stored within transactions, and transactions are stored within the blockchain. So the speaker is not right: addresses are stored in the blockchain, and by scanning the blockchain you can fetch all activity for any address. The balance is simply the result (sum) of all that activity.
However, Bitcoin (or any other Bitcoin-like cryptocurrency) does not build its own address index, so it is impossible to quickly fetch the balance or activity history for a specific address directly from a node. Explorers scan the blockchain and build their own transaction history within their own database. When a new block comes in, the explorer updates its database, and in this way it maintains an up-to-date history for all accounts.
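As a rough illustration of what such an explorer-side index does, here is a minimal sketch. The data model is a deliberate simplification and an assumption: each transaction lists inputs and outputs directly as (address, amount) pairs, whereas real Bitcoin transactions reference previous outputs and use scripts. The names `blocks`, `index_block`, and `balances` are hypothetical.

```python
from collections import defaultdict

# Simplified, hypothetical chain: a list of blocks, each a list of
# transactions with inputs/outputs given as (address, amount) pairs.
blocks = [
    [{"inputs": [], "outputs": [("addr_a", 50)]}],
    [{"inputs": [("addr_a", 50)],
      "outputs": [("addr_b", 20), ("addr_a", 30)]}],
]


def index_block(balances, block):
    """Apply one block's transactions to the explorer's own index."""
    for tx in block:
        for address, amount in tx["inputs"]:
            balances[address] -= amount  # funds spent from this address
        for address, amount in tx["outputs"]:
            balances[address] += amount  # funds received by this address


# Initial scan of the whole chain; a new block later only needs
# one more index_block() call to keep the index current.
balances = defaultdict(int)
for block in blocks:
    index_block(balances, block)
```

Looking up `balances["addr_a"]` is then a single dictionary read instead of a scan over every block.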

DynamoDB - Reducing number of queries

After my users log in the app makes too many requests to DynamoDB and I am thinking about different ways to reduce the number of calls.
The app allows user to trigger certain alerts that get sent to other users. For instance: "Shipment received, come to the deck", "Shipment completed", etc.
These are the calls made:
Get company's software license expiration date.
Get the computer's location in the building (i.e. "Office A").
Get the kinds of alerts that can be triggered (i.e. "Shipment received, come to the deck", "Shipment completed", etc).
Get information about the user (i.e. the company teams the user belongs to, and the admin level the user has, which can be 0, 1, 2, or 3).
Potential solutions I have thought about:
Put the company's license expiration date as an attribute of each computer (This would reduce the number of queries by 1). However, if I need to update the company's license expiration date, then I need to update it for EVERY SINGLE computer I have in the system, which sounds impractical to me since I may have 200, 300 or perhaps even more computers in the database.
Add the company's license expiration date as an attribute of the alerts (This would reduce the number of queries by 1); which seems more reasonable because there are only about 15 different kinds of alerts, so if I need to change the license expiration date later on, it is not too bad.
Cache information on the user's device; however, I can't seem to find a good strategy to keep the information stored locally as updated as possible.
I still think these 3 options do not sound too good, so I am hoping someone can point me in the right direction. Is there a good way to reduce the number of calls? I am retrieving information about 4 different entities (license, computer, alert, user), should I leave those 4 calls after users log in?
Here are a few things that can be done for each component.
Get information about the user
Keep it in a session store, and update the store whenever the details change. Session stores are usually implemented using a cache like Redis.
Computer location
Keep it in a distributed cache like Redis and initialise it lazily. Whenever a new write happens to a computer's location (rare, IMO), remove the entry from Redis using DynamoDB Streams and AWS Lambda.
Kind of alerts
Same as Computer location.
License expiration date
If possible, don't allow the license expiry date to be changed (issue a new license in those cases, so that traceability is maintained) and cache the license expiry forever. Otherwise, handle it the same way as Computer location.
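The lazy-initialise-then-invalidate pattern suggested above could look roughly like this. It is a sketch under stated assumptions: a plain in-process dictionary stands in for Redis, `computer_table` stands in for the DynamoDB table, and `LazyCache`, `load_location`, and `stats` are hypothetical names. In the real setup, the `invalidate` call would be made by the function triggered from the table's change stream.

```python
class LazyCache:
    """Cache that loads a value on first access and keeps it until invalidated."""

    def __init__(self, loader):
        self._loader = loader    # called only on a cache miss
        self._entries = {}       # stand-in for the distributed cache

    def get(self, key):
        if key not in self._entries:
            self._entries[key] = self._loader(key)
        return self._entries[key]

    def invalidate(self, key):
        # Dropping the entry forces a fresh load on the next get().
        self._entries.pop(key, None)


# Hypothetical stand-in for the database table, plus a read counter.
computer_table = {"pc-17": "Office A"}
stats = {"db_reads": 0}


def load_location(computer_id):
    stats["db_reads"] += 1
    return computer_table[computer_id]


locations = LazyCache(load_location)
locations.get("pc-17")            # miss: one database read
locations.get("pc-17")            # hit: no database read

computer_table["pc-17"] = "Office B"   # a (rare) write to the table
locations.invalidate("pc-17")          # done by the stream-triggered function
current = locations.get("pc-17")       # miss again: reloads the new value
```

With this shape, repeated logins only touch the database after a write has actually changed the underlying item.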

is Last 4-digits of credit card and Expiry Date storage allowed in PCI-DSS?

We need to store the last 4 digits of the credit card (in order to let customers know which card they have used) and the expiry date (to notify customers that their card is about to expire) for our subscription/recurring payment based SaaS application.
Is storing these two data elements allowed under PCI DSS? Please answer with a reference/link to an official website or document.
Please note: We are not storing Name On Card and CVV numbers
You should be OK with regard to PCI regulations.
This table lays out what data can be stored:
https://www.pcisecuritystandards.org/pdfs/pci_fs_data_storage.pdf
"If required for business purposes, the cardholder’s name, PAN, expiration date, and service code may be stored as long as they are protected in accordance with PCI DSS requirements."
-edit-
According to the bottom table in that doc, you should be able to store those elements. Since you are not storing the full PAN, Requirement 3.4 shouldn't apply to the other elements.
If it helps, we got Level 1 certified and we store the last 4 digits and the expiration date in clear text. You don't need to be audited unless you are Level 1 (assuming Merchant here, not Service Provider).
From what I am reading within the PCI Data Storage Do's and Don'ts PDF (https://www.pcisecuritystandards.org/pdfs/pci_fs_data_storage.pdf)
You are able to store the expiration date, service code, and cardholder name so long as you do NOT store the PAN.
Direct quote from the PDF:
These data elements must be protected if stored in conjunction with the PAN. This protection should be per PCI DSS requirements for general protection of the cardholder data environment. Additionally, other legislation (e.g., related to consumer personal data protection, privacy, identity theft, or data security) may require specific protection of this data, or proper disclosure of a company’s practices if consumer-related personal data is being collected during the course of business. PCI DSS, however, does not apply if PANs are not stored, processed, or transmitted.

How can I have 5 visits with zero unique visitors?

According to Google Analytics, I had 5 visits from zero unique visitors. Is that a bug or did I perhaps implement something wrongly? Or hasn't the data processing finished yet (I created this view 2 days ago)?
The view is based on an include custom filter that's supposed to include only traffic from any of three ip addresses. The regex I used for this is
62\.58\.32\.193|77\.172\.143\.12$|213\.125\.166\.98
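For reference, the behaviour of this expression can be checked locally with a short sketch. It assumes the filter does an unanchored (partial) match, which is modelled here with `re.search`; the helper `included` and the stricter `strict` pattern are hypothetical names added for illustration.

```python
import re

# The filter expression exactly as written above.
pattern = re.compile(r"62\.58\.32\.193|77\.172\.143\.12$|213\.125\.166\.98")

# A stricter variant that anchors all three alternatives at both ends.
strict = re.compile(
    r"^(?:62\.58\.32\.193|77\.172\.143\.12|213\.125\.166\.98)$"
)


def included(ip: str) -> bool:
    """Return True if the original expression matches anywhere in ip."""
    return pattern.search(ip) is not None


print(included("62.58.32.193"))     # one of the three intended addresses
print(included("77.172.143.129"))   # rejected thanks to the trailing $
print(included("162.58.32.193"))    # accepted: first alternative is unanchored
print(strict.search("162.58.32.193") is not None)
```

Only the middle alternative is anchored with `$`, so the first and third alternatives can also match inside longer addresses.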
My best guess would be the way Google defines unique visitors. I have sometimes visited my own website periodically and ended up showing up as a unique visitor (my site isn't very popular, so that's easy for me to track). I would say it has to do either with the nature of the visits or with the way unique visitors are actually calculated. According to Google, this is how they calculate unique visitors:
The other Unique Visitors metric calculation (Calculation #2) is based
on the __utma cookie. Calculation #2 is used when segmenting the
Audience Overview report or when viewing Unique Visitors over any
dimension other than date. As such, Calculation #2 is used in custom
reports to allow for the calculation of Unique Visitors over any
dimension, such as browser, city, or traffic source.
source: https://support.google.com/analytics/answer/2992042?hl=en
Occasionally, there are problems with Google Analytics reporting. Check the product forums. For example, here is an issue that happened on Nov 11, 2013:
http://productforums.google.com/forum/#!topic/analytics/fsurDK8AOcY
This issue can also crop up when you are using the page dimension. Unique visitors are only assigned to the first page in a visit as described here. But, it doesn't seem like that is the case for you.
Finally, it's possible, analogous to the page dimension situation, that unique visitors are only assigned to the first IP address that a visitor came from. If that is true, then people who had previously come to your site from a different IP address wouldn't show up as unique visitors in your filter.