Add a new flow risk NDPI_ANONYMOUS_SUBSCRIBER #1462

Merged
merged 1 commit into from
Feb 28, 2022

Collaborator

@IvanNardi IvanNardi commented Feb 27, 2022

The main goal of a DPI engine is usually to determine "what", i.e. which
types of traffic flow on the network.
However, applications using DPI are often also interested in "who",
i.e. which user/subscriber generated that traffic.

The association between a flow and a subscriber is usually made via some
kind of DHCP/GTP/RADIUS/NAT mapping. In all these cases the key element
of the flow used to identify the user is the source IP address.

That works for the vast majority of the traffic.

However, depending on the protocols involved and on the point in the
network where the traffic is captured, the source IP address might have
been changed/anonymized. In that case, that address is useless for any
flow-username association.

Example: iCloud Private Relay traffic captured between the exit relay and
the server.
See the picture on page 5 of:
https://www.apple.com/privacy/docs/iCloud_Private_Relay_Overview_Dec2021.PDF

This commit adds a new generic flow risk NDPI_ANONYMOUS_SUBSCRIBER hinting
that the IP addresses shouldn't be used to identify the user associated
with the flow.
As a first example of this new feature, the entire list of the relay IP
addresses used by Private Relay is added.

A key point to note is that this list is NOT used for flow classification
(unlike all the other IP lists present in nDPI) but only for setting this
new flow risk.

TODO: IPv6
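
To make the separation between classification and risk concrete, here is a minimal, self-contained sketch in C. It is not the code in this commit: `struct flow_info`, `RISK_ANONYMOUS_SUBSCRIBER` and the function names are hypothetical, the ranges are RFC 5737 documentation prefixes rather than real relay addresses, and treating a match on either endpoint as sufficient is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical risk bit; not the real nDPI enum value. */
#define RISK_ANONYMOUS_SUBSCRIBER (1u << 0)

struct cidr {
  uint32_t network;    /* IPv4 network address, host byte order */
  uint8_t  prefix_len; /* 0..32 */
};

/* Placeholder ranges (RFC 5737 documentation prefixes), NOT real relay addresses. */
static const struct cidr anonymizing_egress[] = {
  { 0xC0000200u, 24 }, /* 192.0.2.0/24 */
  { 0xC6336400u, 24 }, /* 198.51.100.0/24 */
};

/* Build the netmask for a prefix length; 0 is special-cased to avoid a 32-bit shift. */
static uint32_t prefix_mask(uint8_t prefix_len) {
  assert(prefix_len <= 32);
  return prefix_len == 0 ? 0u : (uint32_t)(0xFFFFFFFFu << (32 - prefix_len));
}

/* Return 1 if addr falls inside any range of the egress list, 0 otherwise. */
static int is_anonymizing_egress(uint32_t addr) {
  size_t i;
  for (i = 0; i < sizeof(anonymizing_egress) / sizeof(anonymizing_egress[0]); i++) {
    uint32_t mask = prefix_mask(anonymizing_egress[i].prefix_len);
    if ((addr & mask) == anonymizing_egress[i].network)
      return 1;
  }
  return 0;
}

/* Hypothetical, simplified flow record. */
struct flow_info {
  uint32_t src_addr;     /* host byte order */
  uint32_t dst_addr;     /* host byte order */
  uint16_t app_protocol; /* result of classification; never modified below */
  uint32_t risk_mask;    /* bitmask of flow risks */
};

/* Set the risk if either endpoint is a known egress address (assumption). */
static void update_anonymous_subscriber_risk(struct flow_info *f) {
  assert(f != NULL);
  if (is_anonymizing_egress(f->src_addr) || is_anonymizing_egress(f->dst_addr))
    f->risk_mask |= RISK_ANONYMOUS_SUBSCRIBER;
}
```

Note that `app_protocol` is never written: the list lookup only ORs a bit into the risk mask, which mirrors the point that the list is not used for classification.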

Collaborator

@utoni utoni left a comment


  1. Does it make sense to set this risk not only for iCloud relays, but also for other anonymization services, e.g. TOR, or if an HTTP proxy is detected?
  2. I do not see any NDPI_SET_BIT. How is this risk set?

@IvanNardi
Collaborator Author

  1. Does it make sense to set this risk not only for iCloud relays, but also for other anonymization services, e.g. TOR, or if an HTTP proxy is detected?

Yeah, the idea is to incrementally extend this feature to other services.
The difficult part is how to detect them, IMO.
Let's talk about TOR for example: we should have an IP list with the "ingress" nodes (to classify the flow as NDPI_PROTOCOL_TOR) and another list with the "egress" nodes (to set this new risk). I don't know if that is even feasible in the first place; I am not a TOR expert.
Similarly for the HTTP Proxy: are we able to distinguish the two "legs", before and after the proxy?

With PrivateRelay we are able to identify the flows before the "proxies" (via SNI) and the (different) flows after them (via this new IP list).

I think that when nDPI classifies a flow as NDPI_PROTOCOL_HTTP_PROXY, it is the first leg, i.e. the source address really identifies the user.
No idea which TOR list we are using right now...
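
The two-leg reasoning above can be sketched as a small decision function. This is only an illustration under stated assumptions: `struct leg_observation`, the enum and the function names are hypothetical, and `ingress.relay.example` is a placeholder hostname, not the SNI nDPI actually matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder hostname, NOT the real SNI used for classification. */
#define HYPOTHETICAL_INGRESS_SNI "ingress.relay.example"

enum subscriber_hint {
  SRC_ADDR_IDENTIFIES_SUBSCRIBER, /* first leg, or ordinary traffic */
  SRC_ADDR_ANONYMIZED             /* leg after the relay/proxy */
};

/* Hypothetical summary of what was observed on one flow. */
struct leg_observation {
  const char *tls_sni;     /* SNI seen in the TLS handshake, or NULL */
  int peer_in_egress_list; /* 1 if an endpoint matched the egress IP list */
};

/* First leg: the client talks to the relay, recognizable by its SNI. */
static int is_ingress_leg(const struct leg_observation *o) {
  assert(o != NULL);
  return o->tls_sni != NULL && strcmp(o->tls_sni, HYPOTHETICAL_INGRESS_SNI) == 0;
}

/* Decide whether the source address can still be mapped to a subscriber. */
static enum subscriber_hint subscriber_hint_for(const struct leg_observation *o) {
  assert(o != NULL);
  if (is_ingress_leg(o))
    return SRC_ADDR_IDENTIFIES_SUBSCRIBER; /* before the relay: real user address */
  return o->peer_in_egress_list ? SRC_ADDR_ANONYMIZED
                                : SRC_ADDR_IDENTIFIES_SUBSCRIBER;
}
```

The same shape would apply to any other service, provided both legs can be told apart, which is exactly the open question for TOR and HTTP proxies.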

@sonarcloud

sonarcloud bot commented Feb 28, 2022

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: no coverage information
Duplication: 0.0%

@IvanNardi IvanNardi changed the title Add a new flow risk NDPI_ANONYMOUS_USER Add a new flow risk NDPI_ANONYMOUS_SUBSCRIBER Feb 28, 2022
@utoni
Collaborator

utoni commented Feb 28, 2022

Let's talk about TOR for example: we should have an IP list with the "ingress" nodes (to classify the flow as NDPI_PROTOCOL_TOR) and another list with the "egress" nodes (to set this new risk). I don't know if that is even feasible in the first place; I am not a TOR expert.

It should be possible to use the publicly available list of TOR exit nodes and set the risk according to their address tuple. Or is that not how the feature is meant to be used? Anyway, it might be better to solve this in a follow-up, so this PR does not get bloated.

@IvanNardi
Collaborator Author

It should be possible to use the publicly available list of TOR exit nodes and set the risk according to their address tuple. Or is that not how the feature is meant to be used?

You got that right; that is the general idea. I know that there are lists with (only) the exit nodes, but I don't know if there are lists with only the "ingress" nodes (for flow classification).
I will try something...
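
As a rough sketch of the exit-node idea, assuming the public list is available as one dotted-quad IPv4 address per line: load the addresses into a sorted table and look each flow address up with a binary search. All names here are hypothetical, and the sample addresses used with it should be documentation addresses, not real exit nodes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse "a.b.c.d" into host byte order; returns 1 on success, 0 otherwise. */
static int parse_ipv4(const char *s, uint32_t *out) {
  unsigned a, b, c, d;
  char trailing;
  assert(s != NULL && out != NULL);
  /* Exactly four numbers and no trailing character must be converted. */
  if (sscanf(s, "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4)
    return 0;
  if (a > 255 || b > 255 || c > 255 || d > 255)
    return 0;
  *out = ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
  return 1;
}

/* Three-way comparison of two uint32_t values for qsort/bsearch. */
static int cmp_u32(const void *x, const void *y) {
  uint32_t a = *(const uint32_t *)x, b = *(const uint32_t *)y;
  return (a > b) - (a < b);
}

/* Parse up to cap addresses from lines[], skipping malformed entries,
 * then sort the table. Returns the number of addresses stored. */
static size_t load_exit_nodes(const char *const *lines, size_t n_lines,
                              uint32_t *out, size_t cap) {
  size_t i, n = 0;
  assert(lines != NULL && out != NULL);
  for (i = 0; i < n_lines && n < cap; i++) {
    uint32_t addr;
    if (parse_ipv4(lines[i], &addr))
      out[n++] = addr;
  }
  qsort(out, n, sizeof(out[0]), cmp_u32);
  return n;
}

/* Return 1 if addr is present in the sorted table, 0 otherwise. */
static int is_exit_node(const uint32_t *sorted, size_t n, uint32_t addr) {
  assert(sorted != NULL);
  return bsearch(&addr, sorted, n, sizeof(sorted[0]), cmp_u32) != NULL;
}
```

A flow whose address matches the table would then get the new risk set, while classification as NDPI_PROTOCOL_TOR would still depend on a separate "ingress" list, if one exists.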

@lucaderi
Member

I think the idea and the PR are great and I believe they can be merged. As you said, some extra work will probably be necessary in order to complete it.

@utoni utoni merged commit 7a7e4ee into ntop:dev Feb 28, 2022
@IvanNardi IvanNardi deleted the anonymous-user branch February 28, 2022 14:31