[linux-yocto] [PATCH 1/2] netatop: Initial integration
Bruce Ashfield
bruce.ashfield at windriver.com
Mon Apr 9 12:19:52 PDT 2018
On 2018-04-05 3:08 PM, Bruce Ashfield wrote:
> On 2018-03-28 11:25 PM, He Zhe wrote:
>> This is a net statistics collector for atop from:
>> http://atoptool.nl/netatop.php
>> https://atoptool.nl/download/netatop-2.0.tar.gz
>>
>> Two changes:
>> Adjust paths for netatop.h and netatopversion.h
>> Add module_init and module_exit
>
> Which kernel versions have you tested this on? Is there a way we could get
> a README that describes how to build and test it?
Also, is there a recipe that builds this that I can see/test?
Bruce
>
> Bruce
>
>>
>> Signed-off-by: He Zhe <zhe.he at windriver.com>
>> ---
>> drivers/staging/netatop/netatop.c | 1843 ++++++++++++++++++++++++++++++
>> drivers/staging/netatop/netatop.h | 47 +
>> drivers/staging/netatop/netatopversion.h | 2 +
>> 3 files changed, 1892 insertions(+)
>> create mode 100644 drivers/staging/netatop/netatop.c
>> create mode 100644 drivers/staging/netatop/netatop.h
>> create mode 100644 drivers/staging/netatop/netatopversion.h
>>
>> diff --git a/drivers/staging/netatop/netatop.c b/drivers/staging/netatop/netatop.c
>> new file mode 100644
>> index 000000000000..3baa6520416a
>> --- /dev/null
>> +++ b/drivers/staging/netatop/netatop.c
>> @@ -0,0 +1,1843 @@
>> +/*
>> +** This module uses the netfilter interface to maintain statistics
>> +** about the network traffic per task, on level of thread group
>> +** and individual thread.
>> +**
>> +** General setup
>> +** -------------
>> +** Once the module is active, it is called for every packet that is
>> +** transmitted by a local process and every packet that is received
>> +** from an interface. Not only the packets that contain the user data
>> +** are passed but also the TCP related protocol packets (SYN, ACK, ...).
>> +**
>> +** When the module discovers a packet for a connection (TCP) or local
>> +** port (UDP) that is new, it creates a sockinfo structure. As soon as
>> +** possible the sockinfo struct will be connected to a taskinfo struct
>> +** that represents the process or thread that is related to the socket.
>> +** However, the task can only be determined when a packet is transmitted,
>> +** i.e. the module is called during system call handling in the context
>> +** of the transmitting process. At that moment the tgid (process) and
>> +** pid (thread) can be obtained from the process administration to
>> +** be stored in the module's own taskinfo structs (one for the process,
>> +** one for the thread).
>> +** For the time that the sockinfo struct cannot be related to a taskinfo
>> +** struct (e.g. when only packets are received), counters are maintained
>> +** temporarily in the sockinfo struct. As soon as a related taskinfo struct
>> +** is discovered when the task transmits, counters will be maintained in
>> +** the taskinfo struct itself.
>> +** When packets are only received for a socket (e.g. another machine is
>> +** sending UDP packets to the local machine) while the local task
>> +** never responds, no match to a process can be made and the packets
>> +** remain unidentified by the netatop module. At least one packet should
>> +** have been sent by a local process to be able to match packets for
>> +** such a socket.
>> +** In the file /proc/netatop counters can be found that show the total
>> +** number of packets sent/received and how many of these packets were
>> +** unidentified (i.e. not accounted to a process/thread).
>> +**
>> +** Garbage collection
>> +** ------------------
>> +** The module uses a garbage collector to cleanup the unused sockinfo
>> +** structs if connections do not exist any more (TCP) or have not been
>> +** used for some time (TCP/UDP).
>> +** Furthermore, the garbage collector checks if the taskinfo structs
>> +** still represent existing processes or threads. If not, the taskinfo
>> +** struct is destroyed (in case of a thread) or it is moved to a
>> +** separate list of finished processes (in case of a process). Analysis
>> +** programs can read the taskinfo of such a finished process. When the
>> +** taskinfo of a finished process is not read within 15 seconds, the
>> +** taskinfo will be destroyed.
>> +**
>> +** A garbage collector cycle can be triggered by issuing a getsockopt
>> +** call from an analysis program (e.g. atop). Apart from that, a
>> +** time-based garbage collector cycle is issued anyhow every 15 seconds
>> +** by the knetatop kernel thread.
>> +**
>> +** Interface with user mode
>> +** ------------------------
>> +** Programs can open an IP socket and use the getsockopt() system call
>> +** to issue commands to this module. With the command NETATOP_GETCNT_TGID
>> +** the current counters can be obtained on process level (thread group)
>> +** and with the command NETATOP_GETCNT_PID the counters on thread level.
>> +** For both commands, the tgid/pid of the required thread (group) has
>> +** to be passed. When the required thread (group) does not exist,
>> +** errno ESRCH is returned.
>> +**
>> +** The command ATOP_GETCNT_EXIT can be issued to obtain the counters of
>> +** an exited process. As stated above, such a command has to be issued
>> +** within 15 seconds after a process has been declared 'finished' by
>> +** the garbage collector. Whenever this command is issued and no exited
>> +** process is in the exitlist, the requesting process is blocked until
>> +** an exited process is available.
>> +**
>> +** The command NETATOP_FORCE_GC activates the garbage collector of the
>> +** netatop module to determine if sockinfo's of old connections/ports
>> +** can be destroyed and if taskinfo's of exited processes can be moved
>> +** to the exitlist.
>> +** The command NETATOP_EMPTY_EXIT can be issued to wait until the
>> +** exitlist with the taskinfo's of exited processes is empty.
>> +**
>> +** ----------------------------------------------------------------------
>> +** Copyright (C) 2012 Gerlof Langeveld (gerlof.langeveld at atoptool.nl)
>> +**
>> +** This program is free software; you can redistribute it and/or modify
>> +** it under the terms of the GNU General Public License version 2 as
>> +** published by the Free Software Foundation.
>> +*/
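For reference, the user-mode interface described in the header comment can be sketched from user space roughly as below. This is a hedged sketch, not part of the patch: the value of NETATOP_BASE_CTL, the NETATOP_GETCNT_TGID command number, and the netpertask layout live in atop's netatop.h (not shown in this hunk) and are assumptions here.

```c
/* Hypothetical user-space sketch of the getsockopt() interface described
 * above. NETATOP_BASE_CTL's value and the netpertask layout come from
 * netatop.h, which is not in this hunk -- the values below are assumed. */
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>

#define NETATOP_BASE_CTL	15661			/* assumed */
#define NETATOP_GETCNT_TGID	(NETATOP_BASE_CTL + 1)	/* assumed */

struct netpertask {			/* assumed layout, see netatop.h */
	pid_t		id;		/* tgid (or pid) to query */
	unsigned long	btime;
	char		command[16];
	/* followed by the per-protocol taskcount counters */
};

/* returns 0 on success; -1 with errno ENOPROTOOPT when the module is
** not loaded, or ESRCH when the thread group is unknown to it
*/
static int netatop_get_tgid(pid_t tgid, struct netpertask *npt)
{
	socklen_t len = sizeof *npt;
	int ret, fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd == -1)
		return -1;

	memset(npt, 0, sizeof *npt);
	npt->id = tgid;

	/* nf_sockopt handlers hang off the IP level of any INET socket */
	ret = getsockopt(fd, IPPROTO_IP, NETATOP_GETCNT_TGID, npt, &len);
	close(fd);
	return ret;
}
```

On a system without the module loaded, the call fails with ENOPROTOOPT, which an analysis program can use to detect that netatop is absent.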
>> +#include <linux/init.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/netfilter.h>
>> +#include <linux/netfilter_ipv4.h>
>> +#include <linux/sched.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/types.h>
>> +#include <linux/file.h>
>> +#include <linux/kthread.h>
>> +#include <linux/seq_file.h>
>> +#include <linux/version.h>
>> +#include <net/sock.h>
>> +#include <linux/ip.h>
>> +#include <linux/tcp.h>
>> +#include <linux/udp.h>
>> +#include <linux/proc_fs.h>
>> +#include <net/ip.h>
>> +#include <net/tcp.h>
>> +#include <net/udp.h>
>> +
>> +#include "netatop.h"
>> +#include "netatopversion.h"
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_AUTHOR("Gerlof Langeveld <gerlof.langeveld at atoptool.nl>");
>> +MODULE_DESCRIPTION("Per-task network statistics");
>> +MODULE_VERSION(NETATOPVERSION);
>> +
>> +#define GCINTERVAL (HZ*15) // interval garbage collector (jiffies)
>> +#define GCMAXUDP (HZ*16) // max inactivity for UDP (jiffies)
>> +#define GCMAXTCP (HZ*1800) // max inactivity for TCP (jiffies)
>> +#define GCMAXUNREF (HZ*60) // max time without taskref (jiffies)
>> +
>> +#define SILIMIT (4096*1024) // maximum memory for sockinfo structs
>> +#define TILIMIT (2048*1024) // maximum memory for taskinfo structs
>> +
>> +#define NF_IP_PRE_ROUTING 0
>> +#define NF_IP_LOCAL_IN 1
>> +#define NF_IP_FORWARD 2
>> +#define NF_IP_LOCAL_OUT 3
>> +#define NF_IP_POST_ROUTING 4
>> +
>> +/*
>> +** struct that maintains statistics about the network
>> +** traffic caused per thread or thread group
>> +*/
>> +struct chainer {
>> + void *next;
>> + void *prev;
>> +};
>> +
>> +struct taskinfobucket;
>> +
>> +struct taskinfo {
>> + struct chainer ch;
>> +
>> + pid_t id; // tgid or pid
>> + char type; // 'g' (thread group) or
>> + // 't' (thread)
>> + unsigned char state; // see below
>> + char command[COMLEN];
>> + unsigned long btime; // start time of process
>> + unsigned long long exittime; // time inserted in exitlist
>> +
>> + struct taskcount tc;
>> +};
>> +
>> +static struct kmem_cache *ticache; // taskinfo cache
>> +
>> +// state values above
>> +#define CHECKED 1 // verified that task still exists
>> +#define INDELETE 2 // task exited but still in hash list
>> +#define FINISHED 3 // task on exit list
>> +
>> +/*
>> +** hash tables to find a particular thread group or thread
>> +*/
>> +#define TBUCKS 1024 // must be a power of 2!
>> +#define THASH(x, t) (((x)+t)&(TBUCKS-1))
>> +
>> +struct taskinfobucket {
>> + struct chainer ch;
>> + spinlock_t lock;
>> +} thash[TBUCKS];
>> +
>> +static unsigned long nrt; // current number of taskinfo allocated
>> +static unsigned long nrt_ovf; // no taskinfo allocated due to overflow
>> +static DEFINE_SPINLOCK(nrtlock);
>> +
>> +
>> +static struct taskinfo *exithead; // linked list of exited processes
>> +static struct taskinfo *exittail;
>> +static DEFINE_SPINLOCK(exitlock);
>> +
>> +static DECLARE_WAIT_QUEUE_HEAD(exitlist_filled);
>> +static DECLARE_WAIT_QUEUE_HEAD(exitlist_empty);
>> +
>> +static unsigned long nre; // current number of taskinfo on exitlist
>> +
>> +/*
>> +** structs that uniquely identify a TCP connection (host endian format)
>> +*/
>> +struct tcpv4_ident {
>> + uint32_t laddr; /* local IP address */
>> + uint32_t raddr; /* remote IP address */
>> + uint16_t lport; /* local port number */
>> + uint16_t rport; /* remote port number */
>> +};
>> +
>> +struct tcpv6_ident {
>> + struct in6_addr laddr; /* local IP address */
>> + struct in6_addr raddr; /* remote IP address */
>> + uint16_t lport; /* local port number */
>> + uint16_t rport; /* remote port number */
>> +};
>> +
>> +/*
>> +** struct to maintain the reference from a socket
>> +** to a thread and thread-group
>> +*/
>> +struct sockinfo {
>> + struct chainer ch;
>> +
>> + unsigned char last_state; // last known state of socket
>> + uint8_t proto; // protocol
>> +
>> + union keydef {
>> + uint16_t udp; // UDP ident (only portnumber)
>> + struct tcpv4_ident tcp4; // TCP connection ident IPv4
>> + struct tcpv6_ident tcp6; // TCP connection ident IPv6
>> + } key;
>> +
>> + struct taskinfo *tgp; // ref to thread group
>> + struct taskinfo *thp; // ref to thread (or NULL)
>> +
>> + short tgh; // hash number of thread group
>> + short thh; // hash number of thread
>> +
>> + unsigned int sndpacks; // temporary counters in case
>> + unsigned int rcvpacks; // no relation to a process is
>> + unsigned long sndbytes; // known yet
>> + unsigned long rcvbytes;
>> +
>> + unsigned long long lastact; // last updated (jiffies)
>> +};
>> +
>> +static struct kmem_cache *sicache; // sockinfo cache
>> +
>> +/*
>> +** hash table to find a socket reference
>> +*/
>> +#define SBUCKS 1024 // must be a power of 2!
>> +#define SHASHTCP4(x) (((x).raddr+(x).lport+(x).rport)&(SBUCKS-1))
>> +#define SHASHUDP(x) ((x)&(SBUCKS-1))
>> +
>> +struct {
>> + struct chainer ch;
>> + spinlock_t lock;
>> +} shash[SBUCKS];
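Both THASH and SHASHUDP rely on the bucket count being a power of two, so that masking with `count - 1` always yields a valid bucket index. A minimal user-space sketch of the same masking scheme (function names are illustrative, not from the patch):

```c
/* Stand-alone replica of the THASH()/SHASHUDP() masking scheme used by
 * the patch. TBUCKS/SBUCKS must be powers of two for the mask to work. */
#include <assert.h>
#include <stdint.h>

#define TBUCKS 1024			/* must be a power of 2 */
#define SBUCKS 1024			/* must be a power of 2 */

/* mirrors THASH(x, t): mix the id with the type, then mask */
static inline unsigned int thash_bucket(int id, char type)
{
	return ((unsigned int)id + (unsigned char)type) & (TBUCKS - 1);
}

/* mirrors SHASHUDP(x): the local UDP port masked into a bucket */
static inline unsigned int shash_udp_bucket(uint16_t lport)
{
	return lport & (SBUCKS - 1);
}
```

With a non-power-of-two size the mask would skip buckets entirely, which is why the comment insists on it.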
>> +
>> +static unsigned long nrs; // current number of sockinfo allocated
>> +static unsigned long nrs_ovf; // no sockinfo allocated due to overflow
>> +static DEFINE_SPINLOCK(nrslock);
>> +
>> +/*
>> +** various static counters
>> +*/
>> +static unsigned long icmpsndbytes;
>> +static unsigned long icmpsndpacks;
>> +static unsigned long icmprcvbytes;
>> +static unsigned long icmprcvpacks;
>> +
>> +static unsigned long tcpsndpacks;
>> +static unsigned long tcprcvpacks;
>> +static unsigned long udpsndpacks;
>> +static unsigned long udprcvpacks;
>> +static unsigned long unidentudpsndpacks;
>> +static unsigned long unidentudprcvpacks;
>> +static unsigned long unidenttcpsndpacks;
>> +static unsigned long unidenttcprcvpacks;
>> +
>> +static unsigned long unknownproto;
>> +
>> +static DEFINE_MUTEX(gclock);
>> +static unsigned long long gclast; // last garbage collection (jiffies)
>> +
>> +static struct task_struct *knetatop_task;
>> +
>> +static struct timespec boottime;
>> +
>> +/*
>> +** function prototypes
>> +*/
>> +static void analyze_tcpv4_packet(struct sk_buff *,
>> + const struct net_device *, int, char,
>> + struct iphdr *, void *);
>> +
>> +static void analyze_udp_packet(struct sk_buff *,
>> + const struct net_device *, int, char,
>> + struct iphdr *, void *);
>> +
>> +static int sock2task(char, struct sockinfo *,
>> + struct taskinfo **, short *,
>> + struct sk_buff *, const struct net_device *,
>> + int, char);
>> +
>> +static void update_taskcounters(struct sk_buff *,
>> + const struct net_device *,
>> + struct taskinfo *, char);
>> +
>> +static void update_sockcounters(struct sk_buff *,
>> + const struct net_device *,
>> + struct sockinfo *, char);
>> +
>> +static void sock2task_sync(struct sk_buff *,
>> + struct sockinfo *, struct taskinfo *);
>> +
>> +static void register_unident(struct sockinfo *);
>> +
>> +static int calc_reallen(struct sk_buff *,
>> + const struct net_device *);
>> +
>> +static void get_tcpv4_ident(struct iphdr *, void *,
>> + char, union keydef *);
>> +
>> +static struct sockinfo *find_sockinfo(int, union keydef *, int, int);
>> +static struct sockinfo *make_sockinfo(int, union keydef *, int, int);
>> +
>> +static void wipesockinfo(void);
>> +static void wipetaskinfo(void);
>> +static void wipetaskexit(void);
>> +
>> +static void garbage_collector(void);
>> +static void gctaskexit(void);
>> +static void gcsockinfo(void);
>> +static void gctaskinfo(void);
>> +
>> +static void move_taskinfo(struct taskinfo *);
>> +static void delete_taskinfo(struct taskinfo *);
>> +static void delete_sockinfo(struct sockinfo *);
>> +
>> +static struct taskinfo *get_taskinfo(pid_t, char);
>> +
>> +static int getsockopt(struct sock *, int, void *, int *);
>> +
>> +static int netatop_open(struct inode *inode, struct file *file);
>> +
>> +/*
>> +** hook definitions
>> +*/
>> +static struct nf_hook_ops hookin_ipv4;
>> +static struct nf_hook_ops hookout_ipv4;
>> +
>> +/*
>> +** getsockopt definitions for communication with user space
>> +*/
>> +static struct nf_sockopt_ops sockopts = {
>> + .pf = PF_INET,
>> + .get_optmin = NETATOP_BASE_CTL,
>> + .get_optmax = NETATOP_BASE_CTL+6,
>> + .get = getsockopt,
>> + .owner = THIS_MODULE,
>> +};
>> +
>> +static struct file_operations netatop_proc_fops = {
>> + .open = netatop_open,
>> + .read = seq_read,
>> + .llseek = seq_lseek,
>> + .release = single_release,
>> + .owner = THIS_MODULE,
>> +};
>> +
>> +
>> +/*
>> +** hook function to be called for every incoming local packet
>> +*/
>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 10, 0)
>> +# if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 4, 0)
>> +# define HOOK_ARG_TYPE void *priv
>> +# else
>> +# define HOOK_ARG_TYPE const struct nf_hook_ops *ops
>> +# endif
>> +#else
>> +# define HOOK_ARG_TYPE unsigned int hooknum
>> +#endif
>> +
>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 1, 0)
>> +#define HOOK_STATE_ARGS const struct nf_hook_state *state
>> +
>> +#define DEV_IN state->in
>> +#define DEV_OUT state->out
>> +#else
>> +# ifndef __GENKSYMS__
>> +# define HOOK_STATE_ARGS const struct net_device *in, \
>> + const struct net_device *out, \
>> + const struct nf_hook_state *state
>> +# else
>> +# define HOOK_STATE_ARGS const struct net_device *in, \
>> + const struct net_device *out, \
>> + int (*okfn)(struct sk_buff *)
>> +# endif
>> +#define DEV_IN in
>> +#define DEV_OUT out
>> +#endif
>> +
>> +
>> +static unsigned int
>> +ipv4_hookin(HOOK_ARG_TYPE, struct sk_buff *skb, HOOK_STATE_ARGS)
>> +{
>> + struct iphdr *iph;
>> + void *trh;
>> +
>> + if (skb == NULL) // useless socket buffer?
>> + return NF_ACCEPT;
>> +
>> + /*
>> + ** get pointer to IP header and transport header
>> + */
>> + iph = (struct iphdr *)skb_network_header(skb);
>> + trh = ((char *)iph + (iph->ihl * 4));
>> +
>> + /*
>> + ** react on protocol number
>> + */
>> + switch (iph->protocol) {
>> + case IPPROTO_TCP:
>> + tcprcvpacks++;
>> + analyze_tcpv4_packet(skb, DEV_IN, 0, 'i', iph, trh);
>> + break;
>> +
>> + case IPPROTO_UDP:
>> + udprcvpacks++;
>> + analyze_udp_packet(skb, DEV_IN, 0, 'i', iph, trh);
>> + break;
>> +
>> + case IPPROTO_ICMP:
>> + icmprcvpacks++;
>> + icmprcvbytes += skb->len + DEV_IN->hard_header_len + 4;
>> + break;
>> +
>> + default:
>> + unknownproto++;
>> + }
>> +
>> + // accept every packet after stats gathering
>> + return NF_ACCEPT;
>> +}
>> +
>> +/*
>> +** hook function to be called for every outgoing local packet
>> +*/
>> +static unsigned int
>> +ipv4_hookout(HOOK_ARG_TYPE, struct sk_buff *skb, HOOK_STATE_ARGS)
>> +{
>> + int in_syscall = !in_interrupt();
>> + struct iphdr *iph;
>> + void *trh;
>> +
>> + if (skb == NULL) // useless socket buffer?
>> + return NF_ACCEPT;
>> +
>> + /*
>> + ** get pointer to IP header and transport header
>> + */
>> + iph = (struct iphdr *)skb_network_header(skb);
>> + trh = skb_transport_header(skb);
>> +
>> + /*
>> + ** react on protocol number
>> + */
>> + switch (iph->protocol) {
>> + case IPPROTO_TCP:
>> + tcpsndpacks++;
>> + analyze_tcpv4_packet(skb, DEV_OUT, in_syscall, 'o', iph, trh);
>> + break;
>> +
>> + case IPPROTO_UDP:
>> + udpsndpacks++;
>> + analyze_udp_packet(skb, DEV_OUT, in_syscall, 'o', iph, trh);
>> + break;
>> +
>> + case IPPROTO_ICMP:
>> + icmpsndpacks++;
>> + icmpsndbytes += skb->len + DEV_OUT->hard_header_len + 4;
>> + break;
>> +
>> + default:
>> + unknownproto++;
>> + }
>> +
>> + // accept every packet after stats gathering
>> + return NF_ACCEPT;
>> +}
>> +
>> +/*
>> +** generic function (for input and output) to analyze the current packet
>> +*/
>> +static void
>> +analyze_tcpv4_packet(struct sk_buff *skb,
>> + const struct net_device *ndev, // interface description
>> + int in_syscall, // called during system call?
>> + char direction, // incoming ('i') or outgoing ('o')
>> + struct iphdr *iph, void *trh)
>> +{
>> + union keydef key;
>> + struct sockinfo *sip;
>> + int bs; // hash bucket for sockinfo
>> + unsigned long sflags;
>> +
>> + /*
>> + ** determine tcpv4_ident that identifies this TCP packet
>> + ** and calculate hash bucket in sockinfo hash
>> + */
>> + get_tcpv4_ident(iph, trh, direction, &key);
>> +
>> + /*
>> + ** check if we have seen this tcpv4_ident before with a
>> + ** corresponding thread and thread group
>> + */
>> + bs = SHASHTCP4(key.tcp4);
>> +
>> + spin_lock_irqsave(&shash[bs].lock, sflags);
>> +
>> + if ( (sip = find_sockinfo(IPPROTO_TCP, &key, sizeof key.tcp4, bs))
>> + == NULL) {
>> + // no sockinfo yet: create one
>> + if ( (sip = make_sockinfo(IPPROTO_TCP, &key,
>> + sizeof key.tcp4, bs)) == NULL) {
>> + if (direction == 'i')
>> + unidenttcprcvpacks++;
>> + else
>> + unidenttcpsndpacks++;
>> + goto unlocks;
>> + }
>> + }
>> +
>> + if (skb->sk)
>> + sip->last_state = skb->sk->sk_state;
>> +
>> + /*
>> + ** if needed (re)connect the sockinfo to a taskinfo and update
>> + ** the counters
>> + */
>> +
>> + // connect to thread group and update
>> + if (sock2task('g', sip, &sip->tgp, &sip->tgh,
>> + skb, ndev, in_syscall, direction)) {
>> + // connect to thread and update
>> + (void) sock2task('t', sip, &sip->thp, &sip->thh,
>> + skb, ndev, in_syscall, direction);
>> + }
>> +
>> +unlocks:
>> + spin_unlock_irqrestore(&shash[bs].lock, sflags);
>> +}
>> +
>> +
>> +/*
>> +** generic function (for input and output) to analyze the current packet
>> +*/
>> +static void
>> +analyze_udp_packet(struct sk_buff *skb,
>> + const struct net_device *ndev, // interface description
>> + int in_syscall, // called during system call?
>> + char direction, // incoming ('i') or outgoing ('o')
>> + struct iphdr *iph, void *trh)
>> +{
>> + struct udphdr *udph = (struct udphdr *)trh;
>> + uint16_t udplocal = (direction == 'i' ?
>> + ntohs(udph->dest) : ntohs(udph->source));
>> + int bs; // hash bucket for sockinfo
>> +
>> + union keydef key;
>> + struct sockinfo *sip;
>> + unsigned long sflags;
>> +
>> + /*
>> + ** check if we have seen this local UDP port before with a
>> + ** corresponding thread and thread group
>> + */
>> + key.udp = udplocal;
>> + bs = SHASHUDP(udplocal);
>> +
>> + spin_lock_irqsave(&shash[bs].lock, sflags);
>> +
>> + if ( (sip = find_sockinfo(IPPROTO_UDP, &key, sizeof key.udp, bs))
>> + == NULL) {
>> + // no sockinfo yet: create one
>> + if ( (sip = make_sockinfo(IPPROTO_UDP, &key,
>> + sizeof key.udp, bs)) == NULL) {
>> + if (direction == 'i')
>> + unidentudprcvpacks++;
>> + else
>> + unidentudpsndpacks++;
>> + goto unlocks;
>> + }
>> + }
>> +
>> + /*
>> + ** if needed (re)connect the sockinfo to a taskinfo and update
>> + ** the counters
>> + */
>> +
>> + // connect to thread group and update
>> + if (sock2task('g', sip, &sip->tgp, &sip->tgh,
>> + skb, ndev, in_syscall, direction)) {
>> + // connect to thread and update
>> + (void) sock2task('t', sip, &sip->thp, &sip->thh,
>> + skb, ndev, in_syscall, direction);
>> + }
>> +
>> +unlocks:
>> + spin_unlock_irqrestore(&shash[bs].lock, sflags);
>> +}
>> +
>> +/*
>> +** connect the sockinfo to the correct taskinfo and update the counters
>> +*/
>> +static int
>> +sock2task(char idtype, struct sockinfo *sip, struct taskinfo **tipp,
>> + short *hash, struct sk_buff *skb, const struct net_device *ndev,
>> + int in_syscall, char direction)
>> +{
>> + pid_t curid;
>> + unsigned long tflags;
>> +
>> + if (*tipp == NULL) {
>> + /*
>> + ** no taskinfo connected yet for this reference from
>> + ** sockinfo; to connect to a taskinfo, we must
>> + ** be in system call handling now --> verify
>> + */
>> + if (!in_syscall) {
>> + if (idtype == 'g')
>> + update_sockcounters(skb, ndev, sip, direction);
>> +
>> + return 0; // failed
>> + }
>> +
>> + /*
>> + ** try to find existing taskinfo or create new taskinfo
>> + */
>> + curid = (idtype == 'g' ? current->tgid : current->pid);
>> +
>> + *hash = THASH(curid, idtype); // calc hash
>> +
>> + spin_lock_irqsave(&thash[*hash].lock, tflags);
>> +
>> + if ( (*tipp = get_taskinfo(curid, idtype)) == NULL) {
>> + /*
>> + ** not possible to connect
>> + */
>> + spin_unlock_irqrestore(&thash[*hash].lock, tflags);
>> +
>> + if (idtype == 'g')
>> + update_sockcounters(skb, ndev, sip, direction);
>> +
>> + return 0; // failed
>> + }
>> +
>> + /*
>> + ** new connection made:
>> + ** update task counters with sock counters
>> + */
>> + sock2task_sync(skb, sip, *tipp);
>> + } else {
>> + /*
>> + ** already related to thread group or thread
>> + ** lock existing task
>> + */
>> + spin_lock_irqsave(&thash[*hash].lock, tflags);
>> +
>> + /*
>> + ** check if socket has been passed to another process in the
>> + ** meantime, as programs like xinetd tend to do;
>> + ** if so, connect sockinfo to the new task
>> + */
>> + if (in_syscall) {
>> + curid = (idtype == 'g' ? current->tgid : current->pid);
>> +
>> + if ((*tipp)->id != curid) {
>> + spin_unlock_irqrestore(&thash[*hash].lock,
>> + tflags);
>> + *hash = THASH(curid, idtype);
>> +
>> + spin_lock_irqsave(&thash[*hash].lock, tflags);
>> +
>> + if ( (*tipp = get_taskinfo(curid, idtype))
>> + == NULL) {
>> + spin_unlock_irqrestore(
>> + &thash[*hash].lock, tflags);
>> + return 0;
>> + }
>> + }
>> + }
>> + }
>> +
>> + update_taskcounters(skb, ndev, *tipp, direction);
>> +
>> + spin_unlock_irqrestore(&thash[*hash].lock, tflags);
>> +
>> + return 1;
>> +}
>> +
>> +/*
>> +** update the statistics of a particular thread group or thread
>> +*/
>> +static void
>> +update_taskcounters(struct sk_buff *skb, const struct net_device *ndev,
>> + struct taskinfo *tip, char direction)
>> +{
>> + struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
>> + int reallen = calc_reallen(skb, ndev);
>> +
>> + switch (iph->protocol) {
>> + case IPPROTO_TCP:
>> + if (direction == 'i') {
>> + tip->tc.tcprcvpacks++;
>> + tip->tc.tcprcvbytes += reallen;
>> + } else {
>> + tip->tc.tcpsndpacks++;
>> + tip->tc.tcpsndbytes += reallen;
>> + }
>> + break;
>> +
>> + case IPPROTO_UDP:
>> + if (direction == 'i') {
>> + tip->tc.udprcvpacks++;
>> + tip->tc.udprcvbytes += reallen;
>> + } else {
>> + tip->tc.udpsndpacks++;
>> + tip->tc.udpsndbytes += reallen;
>> + }
>> + }
>> +}
>> +
>> +/*
>> +** update the statistics of a sockinfo without a connected task
>> +*/
>> +static void
>> +update_sockcounters(struct sk_buff *skb, const struct net_device *ndev,
>> + struct sockinfo *sip, char direction)
>> +{
>> + int reallen = calc_reallen(skb, ndev);
>> +
>> + if (direction == 'i') {
>> + sip->rcvpacks++;
>> + sip->rcvbytes += reallen;
>> + } else {
>> + sip->sndpacks++;
>> + sip->sndbytes += reallen;
>> + }
>> +}
>> +
>> +/*
>> +** add the temporary counters in the sockinfo to the new connected task
>> +*/
>> +static void
>> +sock2task_sync(struct sk_buff *skb, struct sockinfo *sip,
>> + struct taskinfo *tip)
>> +{
>> + struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
>> +
>> + switch (iph->protocol) {
>> + case IPPROTO_TCP:
>> + tip->tc.tcprcvpacks += sip->rcvpacks;
>> + tip->tc.tcprcvbytes += sip->rcvbytes;
>> + tip->tc.tcpsndpacks += sip->sndpacks;
>> + tip->tc.tcpsndbytes += sip->sndbytes;
>> + break;
>> +
>> + case IPPROTO_UDP:
>> + tip->tc.udprcvpacks += sip->rcvpacks;
>> + tip->tc.udprcvbytes += sip->rcvbytes;
>> + tip->tc.udpsndpacks += sip->sndpacks;
>> + tip->tc.udpsndbytes += sip->sndbytes;
>> + }
>> +}
>> +
>> +static void
>> +register_unident(struct sockinfo *sip)
>> +{
>> + switch (sip->proto) {
>> + case IPPROTO_TCP:
>> + unidenttcprcvpacks += sip->rcvpacks;
>> + unidenttcpsndpacks += sip->sndpacks;
>> + break;
>> +
>> + case IPPROTO_UDP:
>> + unidentudprcvpacks += sip->rcvpacks;
>> + unidentudpsndpacks += sip->sndpacks;
>> + }
>> +}
>> +
>> +/*
>> +** calculate the number of bytes that are really sent or received
>> +*/
>> +static int
>> +calc_reallen(struct sk_buff *skb, const struct net_device *ndev)
>> +{
>> + /*
>> + ** calculate the real load of this packet on the network:
>> + **
>> + ** - length of IP header, TCP/UDP header and data (skb->len)
>> + **
>> + ** since packet assembly/disassembly is done by the IP layer
>> + ** (we get an input packet that has been assembled already and
>> + ** an output packet that still has to be disassembled), additional
>> + ** IP headers and interface headers have to be calculated
>> + ** for packets that are larger than the mtu
>> + **
>> + ** - interface header length + 4 bytes crc
>> + */
>> + int reallen = skb->len;
>> +
>> + if (reallen > ndev->mtu)
>> + reallen += (reallen / ndev->mtu) *
>> + (sizeof(struct iphdr) + ndev->hard_header_len + 4);
>> +
>> + reallen += ndev->hard_header_len + 4;
>> +
>> + return reallen;
>> +}
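The overhead estimate above is easy to reproduce in user space. The sketch below mirrors calc_reallen() with plain integers; IPHDR_LEN (20, the size of struct iphdr) and the Ethernet hard_header_len of 14 used in the tests are assumptions, not values taken from the patch.

```c
/* User-space replica of calc_reallen(): the packet length plus, for
 * packets above the MTU, per-fragment IP/interface header overhead,
 * plus one interface header and CRC for the packet itself. */
#include <assert.h>

#define IPHDR_LEN	20	/* sizeof(struct iphdr), assumed */
#define CRC_LEN		4

static int calc_reallen_demo(int skblen, int mtu, int hard_header_len)
{
	int reallen = skblen;

	/* extra IP + interface headers for every additional fragment */
	if (reallen > mtu)
		reallen += (reallen / mtu) *
			(IPHDR_LEN + hard_header_len + CRC_LEN);

	/* interface header + CRC of the packet itself */
	reallen += hard_header_len + CRC_LEN;

	return reallen;
}
```

For a 100-byte packet on Ethernet (hard_header_len 14) this gives 100 + 14 + 4 = 118; a 3000-byte packet over a 1500-byte MTU pays the fragment overhead twice.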
>> +
>> +/*
>> +** find the tcpv4_ident for the current packet, represented by
>> +** the sk_buff
>> +*/
>> +static void
>> +get_tcpv4_ident(struct iphdr *iph, void *trh, char direction,
>> + union keydef *key)
>> +{
>> + struct tcphdr *tcph = (struct tcphdr *)trh;
>> +
>> + memset(key, 0, sizeof *key); // important for memcmp later on
>> +
>> + /*
>> + ** determine local/remote IP address and
>> + ** determine local/remote port number
>> + */
>> + switch (direction) {
>> + case 'i': // incoming packet
>> + key->tcp4.laddr = ntohl(iph->daddr);
>> + key->tcp4.raddr = ntohl(iph->saddr);
>> + key->tcp4.lport = ntohs(tcph->dest);
>> + key->tcp4.rport = ntohs(tcph->source);
>> + break;
>> +
>> + case 'o': // outgoing packet
>> + key->tcp4.laddr = ntohl(iph->saddr);
>> + key->tcp4.raddr = ntohl(iph->daddr);
>> + key->tcp4.lport = ntohs(tcph->source);
>> + key->tcp4.rport = ntohs(tcph->dest);
>> + }
>> +}
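The direction-dependent swap above is what lets one sockinfo match both traffic directions of a connection: an incoming packet's destination is the local side, an outgoing packet's source is. A stand-alone sketch (struct and function names are illustrative, mirroring tcpv4_ident and get_tcpv4_ident):

```c
/* Demonstrates the local/remote normalization of get_tcpv4_ident():
 * both directions of the same connection produce an identical key. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct tcp4_ident {		/* mirrors tcpv4_ident, host endian */
	uint32_t laddr, raddr;
	uint16_t lport, rport;
};

static void fill_ident(char dir, uint32_t saddr, uint32_t daddr,
		       uint16_t sport, uint16_t dport,
		       struct tcp4_ident *key)
{
	memset(key, 0, sizeof *key);	/* important for memcmp later */

	if (dir == 'i') {		/* incoming: local side is dst */
		key->laddr = daddr;  key->raddr = saddr;
		key->lport = dport;  key->rport = sport;
	} else {			/* outgoing: local side is src */
		key->laddr = saddr;  key->raddr = daddr;
		key->lport = sport;  key->rport = dport;
	}
}
```

The memset matters for the same reason as in the patch: the key is later compared with memcmp, so any padding must be zeroed.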
>> +
>> +/*
>> +** search for the sockinfo holding the given address info
>> +** the appropriate hash bucket must have been locked before calling
>> +*/
>> +static struct sockinfo *
>> +find_sockinfo(int proto, union keydef *identp, int identsz, int hash)
>> +{
>> + struct sockinfo *sip = shash[hash].ch.next;
>> +
>> + /*
>> + ** search for appropriate struct
>> + */
>> + while (sip != (void *)&shash[hash].ch) {
>> + if ( memcmp(&sip->key, identp, identsz) == 0 &&
>> + sip->proto == proto) {
>> + sip->lastact = jiffies_64;
>> + return sip;
>> + }
>> +
>> + sip = sip->ch.next;
>> + }
>> +
>> + return NULL; // not existing
>> +}
>> +
>> +/*
>> +** create a new sockinfo and fill
>> +** the appropriate hash bucket must have been locked before calling
>> +*/
>> +static struct sockinfo *
>> +make_sockinfo(int proto, union keydef *identp, int identsz, int hash)
>> +{
>> + struct sockinfo *sip;
>> + unsigned long flags;
>> +
>> + /*
>> + ** check if the threshold of memory used for sockinfo structs
>> + ** is reached to avoid that a fork bomb of processes opening
>> + ** a socket leads to memory overload
>> + */
>> + if ( (nrs+1) * sizeof(struct sockinfo) > SILIMIT) {
>> + spin_lock_irqsave(&nrslock, flags);
>> + nrs_ovf++;
>> + spin_unlock_irqrestore(&nrslock, flags);
>> + return NULL;
>> + }
>> +
>> + if ( (sip = kmem_cache_alloc(sicache, GFP_ATOMIC)) == NULL)
>> + return NULL;
>> +
>> + spin_lock_irqsave(&nrslock, flags);
>> + nrs++;
>> + spin_unlock_irqrestore(&nrslock, flags);
>> +
>> + /*
>> + ** insert new struct in doubly linked list
>> + */
>> + memset(sip, '\0', sizeof *sip);
>> +
>> + sip->ch.next = &shash[hash].ch;
>> + sip->ch.prev = shash[hash].ch.prev;
>> + ((struct sockinfo *)shash[hash].ch.prev)->ch.next = sip;
>> + shash[hash].ch.prev = sip;
>> +
>> + sip->proto = proto;
>> + sip->lastact = jiffies_64;
>> + sip->key = *identp;
>> +
>> + return sip;
>> +}
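The four pointer assignments in make_sockinfo() implement tail insertion into a circular doubly linked list whose sentinel is the bucket's own chainer; this works because chainer is the first member of both sockinfo and the bucket struct. A stand-alone sketch (assuming, as the module init presumably does outside this hunk, that each bucket is initialized to point at itself):

```c
/* Sketch of the circular doubly linked list used per hash bucket.
 * The bucket's chainer is a sentinel; casting it to the node type is
 * valid because the chainer is the first member of every node. */
#include <assert.h>

struct chainer {
	void *next;
	void *prev;
};

struct node {			/* stand-in for sockinfo/taskinfo */
	struct chainer ch;	/* must be the first member */
	int val;
};

/* same tail insertion as make_sockinfo(); head is the bucket sentinel */
static void chain_add_tail(struct chainer *head, struct node *n)
{
	n->ch.next = head;
	n->ch.prev = head->prev;
	((struct node *)head->prev)->ch.next = n;
	head->prev = n;
}
```

Traversal then walks `ch.next` until it reaches the sentinel again, exactly as find_sockinfo() compares against `&shash[hash].ch`.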
>> +
>> +/*
>> +** search the taskinfo structure holding the info about the given id/type
>> +** if such taskinfo is not yet present, create a new one
>> +*/
>> +static struct taskinfo *
>> +get_taskinfo(pid_t id, char type)
>> +{
>> + int bt = THASH(id, type);
>> + struct taskinfo *tip = thash[bt].ch.next;
>> + unsigned long tflags;
>> +
>> + /*
>> + ** search if id exists already
>> + */
>> + while (tip != (void *)&thash[bt].ch) {
>> + if (tip->id == id && tip->type == type)
>> + return tip;
>> +
>> + tip = tip->ch.next;
>> + }
>> +
>> + /*
>> + ** check if the threshold of memory used for taskinfo structs
>> + ** is reached to avoid that a fork bomb of processes opening
>> + ** a socket leads to memory overload
>> + */
>> + if ( (nre+nrt+1) * sizeof(struct taskinfo) > TILIMIT) {
>> + spin_lock_irqsave(&nrtlock, tflags);
>> + nrt_ovf++;
>> + spin_unlock_irqrestore(&nrtlock, tflags);
>> + return NULL;
>> + }
>> +
>> + /*
>> + ** id not known yet
>> + ** add new entry to hash list
>> + */
>> + if ( (tip = kmem_cache_alloc(ticache, GFP_ATOMIC)) == NULL)
>> + return NULL;
>> +
>> + spin_lock_irqsave(&nrtlock, tflags);
>> + nrt++;
>> + spin_unlock_irqrestore(&nrtlock, tflags);
>> +
>> + /*
>> + ** insert new struct in doubly linked list
>> + ** and fill values
>> + */
>> + memset(tip, '\0', sizeof *tip);
>> +
>> + tip->ch.next = &thash[bt].ch;
>> + tip->ch.prev = thash[bt].ch.prev;
>> + ((struct taskinfo *)thash[bt].ch.prev)->ch.next = tip;
>> + thash[bt].ch.prev = tip;
>> +
>> + tip->id = id;
>> + tip->type = type;
>> +
>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 17, 0)
>> + // current->real_start_time is type u64 (nanoseconds)
>> + tip->btime = div_u64((current->real_start_time +
>> + (boottime.tv_sec * NSEC_PER_SEC +
>> + boottime.tv_nsec)), NSEC_PER_SEC);
>> +#else
>> + // current->real_start_time is a struct timespec
>> + tip->btime = current->real_start_time.tv_sec + boottime.tv_sec;
>> +
>> + if (current->real_start_time.tv_nsec + boottime.tv_nsec >
>> + NSEC_PER_SEC)
>> + tip->btime++;
>> +#endif
>> +
>> + strncpy(tip->command, current->comm, COMLEN);
>> +
>> + return tip;
>> +}
>> +
>> +/*
>> +** garbage collector that removes:
>> +** - exited tasks that are no longer used by user mode programs
>> +** - sockinfo's that are not used any more
>> +** - taskinfo's that do not exist any more
>> +**
>> +** a mutex avoids that the garbage collector runs several times in parallel
>> +**
>> +** this function may only be called in process context!
>> +*/
>> +static void
>> +garbage_collector(void)
>> +{
>> + mutex_lock(&gclock);
>> +
>> + if (jiffies_64 < gclast + (HZ/2)) { // max 2 GC cycles per second
>> + mutex_unlock(&gclock);
>> + return;
>> + }
>> +
>> + gctaskexit(); // remove remaining taskinfo structs from exit list
>> +
>> + gcsockinfo(); // clean up sockinfo structs in shash list
>> +
>> + gctaskinfo(); // clean up taskinfo structs in thash list
>> +
>> + gclast = jiffies_64;
>> +
>> + mutex_unlock(&gclock);
>> +}
>> +
>> +/*
>> +** tasks in the exitlist can be read by a user mode process for a
>> +** limited amount of time; this function removes all taskinfo
>> +** structures that have not been read within that period of time
>> +** notice that exited processes are chained to the tail, so the oldest
>> +** can be found at the head
>> +*/
>> +static void
>> +gctaskexit()
>> +{
>> + unsigned long flags;
>> + struct taskinfo *tip;
>> +
>> + spin_lock_irqsave(&exitlock, flags);
>> +
>> + for (tip=exithead; tip;) {
>> + if (jiffies_64 < tip->exittime + GCINTERVAL)
>> + break;
>> +
>> + // remove taskinfo from exitlist
>> + exithead = tip->ch.next;
>> + kmem_cache_free(ticache, tip);
>> + nre--;
>> + tip = exithead;
>> + }
>> +
>> + /*
>> + ** if the list is now empty, exithead is NULL as well;
>> + ** clear exittail and wake up waiters for an empty list
>> + */
>> + if (nre == 0) {
>> + exittail = NULL;
>> + wake_up_interruptible(&exitlist_empty);
>> + }
>> +
>> + spin_unlock_irqrestore(&exitlock, flags);
>> +}
>> +
>> +/*
>> +** cleanup sockinfo structures that are connected to finished processes
>> +*/
>> +static void
>> +gcsockinfo()
>> +{
>> + int i;
>> + struct sockinfo *sip, *sipsave;
>> + unsigned long sflags, tflags;
>> + struct pid *pid;
>> +
>> + /*
>> + ** go through all sockinfo hash buckets
>> + */
>> + for (i=0; i < SBUCKS; i++) {
>> + if (shash[i].ch.next == (void *)&shash[i].ch)
>> + continue; // quick return without lock
>> +
>> + spin_lock_irqsave(&shash[i].lock, sflags);
>> +
>> + sip = shash[i].ch.next;
>> +
>> + /*
>> + ** search all sockinfo structs chained in one bucket
>> + */
>> + while (sip != (void *)&shash[i].ch) {
>> + /*
>> + ** TCP connections that were not in
>> + ** state ESTABLISHED or LISTEN can be
>> + ** eliminated
>> + */
>> + if (sip->proto == IPPROTO_TCP) {
>> + switch (sip->last_state) {
>> + case TCP_ESTABLISHED:
>> + case TCP_LISTEN:
>> + break;
>> +
>> + default:
>> + sipsave = sip->ch.next;
>> + delete_sockinfo(sip);
>> + sip = sipsave;
>> + continue;
>> + }
>> + }
>> +
>> + /*
>> + ** check if this sockinfo has had no relation
>> + ** with a thread group for a while
>> + ** if so, delete the sockinfo
>> + */
>> + if (sip->tgp == NULL) {
>> + if (sip->lastact + GCMAXUNREF < jiffies_64) {
>> + register_unident(sip);
>> + sipsave = sip->ch.next;
>> + delete_sockinfo(sip);
>> + sip = sipsave;
>> + } else {
>> + sip = sip->ch.next;
>> + }
>> + continue;
>> + }
>> +
>> + /*
>> + ** check if referred thread group is
>> + ** already marked as 'indelete' during this
>> + ** sockinfo search
>> + ** if so, delete this sockinfo
>> + */
>> + spin_lock_irqsave(&thash[sip->tgh].lock, tflags);
>> +
>> + if (sip->tgp->state == INDELETE) {
>> + spin_unlock_irqrestore(&thash[sip->tgh].lock,
>> + tflags);
>> + sipsave = sip->ch.next;
>> + delete_sockinfo(sip);
>> + sip = sipsave;
>> + continue;
>> + }
>> +
>> + /*
>> + ** check if referred thread group still exists;
>> + ** this step will be skipped if we already verified
>> + ** the existence of the thread group earlier during
>> + ** this garbage collection cycle
>> + */
>> + if (sip->tgp->state != CHECKED) {
>> + /*
>> + ** connected thread group not yet verified
>> + ** during this cycle, so check if it still
>> + ** exists
>> + ** if not, mark the thread group as 'indelete'
>> + ** (it can not be deleted right now because
>> + ** we might find other sockinfo's referring
>> + ** to this thread group during the current
>> + ** cycle) and delete this sockinfo
>> + ** if the thread group exists, just mark
>> + ** it as 'checked' for this cycle
>> + */
>> + rcu_read_lock();
>> + pid = find_vpid(sip->tgp->id);
>> + rcu_read_unlock();
>> +
>> + if (pid == NULL) {
>> + sip->tgp->state = INDELETE;
>> + spin_unlock_irqrestore(
>> + &thash[sip->tgh].lock, tflags);
>> +
>> + sipsave = sip->ch.next;
>> + delete_sockinfo(sip);
>> + sip = sipsave;
>> + continue;
>> + } else {
>> + sip->tgp->state = CHECKED;
>> + }
>> + }
>> +
>> + spin_unlock_irqrestore(&thash[sip->tgh].lock, tflags);
>> +
>> + /*
>> + ** check if this sockinfo has a relation with a thread
>> + ** if not, skip further handling of this sockinfo
>> + */
>> + if (sip->thp == NULL) {
>> + sip = sip->ch.next;
>> + continue;
>> + }
>> +
>> + /*
>> + ** check if referred thread is already marked
>> + ** as 'indelete' during this sockinfo search
>> + ** if so, break connection
>> + */
>> + spin_lock_irqsave(&thash[sip->thh].lock, tflags);
>> +
>> + if (sip->thp->state == INDELETE) {
>> + spin_unlock_irqrestore(&thash[sip->thh].lock,
>> + tflags);
>> + sip->thp = NULL;
>> + sip = sip->ch.next;
>> + continue;
>> + }
>> +
>> + /*
>> + ** check if referred thread is already checked
>> + ** during this sockinfo search
>> + */
>> + if (sip->thp->state == CHECKED) {
>> + spin_unlock_irqrestore(&thash[sip->thh].lock,
>> + tflags);
>> + sip = sip->ch.next;
>> + continue;
>> + }
>> +
>> + /*
>> + ** connected thread not yet verified
>> + ** check if it still exists
>> + ** if not, mark it as 'indelete' and break connection
>> + ** if thread exists, mark it 'checked'
>> + */
>> + rcu_read_lock();
>> + pid = find_vpid(sip->thp->id);
>> + rcu_read_unlock();
>> +
>> + if (pid == NULL) {
>> + sip->thp->state = INDELETE;
>> + sip->thp = NULL;
>> + } else {
>> + sip->thp->state = CHECKED;
>> + }
>> +
>> + spin_unlock_irqrestore(&thash[sip->thh].lock, tflags);
>> +
>> + /*
>> + ** check if a TCP port has not been used
>> + ** for some time --> destroy even if the thread
>> + ** (group) is still there
>> + */
>> + if (sip->proto == IPPROTO_TCP &&
>> + sip->lastact + GCMAXTCP < jiffies_64) {
>> + sipsave = sip->ch.next;
>> + delete_sockinfo(sip);
>> + sip = sipsave;
>> + continue;
>> + }
>> +
>> + /*
>> + ** check if a UDP port has not been used
>> + ** for some time --> destroy even if the thread
>> + ** (group) is still there
>> + ** e.g. outgoing DNS requests (to remote port 53) are
>> + ** issued every time with another source port being
>> + ** a new object that should not be kept too long;
>> + ** local well-known ports are useful to keep
>> + */
>> + if (sip->proto == IPPROTO_UDP &&
>> + sip->lastact + GCMAXUDP < jiffies_64 &&
>> + sip->key.udp > 1024) {
>> + sipsave = sip->ch.next;
>> + delete_sockinfo(sip);
>> + sip = sipsave;
>> + continue;
>> + }
>> +
>> + sip = sip->ch.next;
>> + }
>> +
>> + spin_unlock_irqrestore(&shash[i].lock, sflags);
>> + }
>> +}
>> +
>> +/*
>> +** remove taskinfo structures of finished tasks from hash list
>> +*/
>> +static void
>> +gctaskinfo()
>> +{
>> + int i;
>> + struct taskinfo *tip, *tipsave;
>> + unsigned long tflags;
>> + struct pid *pid;
>> +
>> + /*
>> + ** go through all taskinfo hash buckets
>> + */
>> + for (i=0; i < TBUCKS; i++) {
>> + if (thash[i].ch.next == (void *)&thash[i].ch)
>> + continue; // quick return without lock
>> +
>> + spin_lock_irqsave(&thash[i].lock, tflags);
>> +
>> + tip = thash[i].ch.next;
>> +
>> + /*
>> + ** check all taskinfo structs chained to this bucket
>> + */
>> + while (tip != (void *)&thash[i].ch) {
>> + switch (tip->state) {
>> + /*
>> + ** remove INDELETE tasks from the hash buckets
>> + ** -- move thread group to exitlist
>> + ** -- destroy thread right away
>> + */
>> + case INDELETE:
>> + tipsave = tip->ch.next;
>> +
>> + if (tip->type == 'g')
>> + move_taskinfo(tip); // thread group
>> + else
>> + delete_taskinfo(tip); // thread
>> +
>> + tip = tipsave;
>> + break;
>> +
>> + case CHECKED:
>> + tip->state = 0;
>> + tip = tip->ch.next;
>> + break;
>> +
>> + default: // not checked yet
>> + rcu_read_lock();
>> + pid = find_vpid(tip->id);
>> + rcu_read_unlock();
>> +
>> + if (pid == NULL) {
>> + tipsave = tip->ch.next;
>> +
>> + if (tip->type == 'g')
>> + move_taskinfo(tip);
>> + else
>> + delete_taskinfo(tip);
>> +
>> + tip = tipsave;
>> + } else {
>> + tip = tip->ch.next;
>> + }
>> + }
>> + }
>> +
>> + spin_unlock_irqrestore(&thash[i].lock, tflags);
>> + }
>> +}
>> +
>> +
>> +/*
>> +** remove all sockinfo structs
>> +*/
>> +static void
>> +wipesockinfo()
>> +{
>> + struct sockinfo *sip, *sipsave;
>> + int i;
>> + unsigned long sflags;
>> +
>> + for (i=0; i < SBUCKS; i++) {
>> + spin_lock_irqsave(&shash[i].lock, sflags);
>> +
>> + sip = shash[i].ch.next;
>> +
>> + /*
>> + ** free all structs chained in one bucket
>> + */
>> + while (sip != (void *)&shash[i].ch) {
>> + sipsave = sip->ch.next;
>> + delete_sockinfo(sip);
>> + sip = sipsave;
>> + }
>> +
>> + spin_unlock_irqrestore(&shash[i].lock, sflags);
>> + }
>> +}
>> +
>> +/*
>> +** remove all taskinfo structs from hash list
>> +*/
>> +static void
>> +wipetaskinfo()
>> +{
>> + struct taskinfo *tip, *tipsave;
>> + int i;
>> + unsigned long tflags;
>> +
>> + for (i=0; i < TBUCKS; i++) {
>> + spin_lock_irqsave(&thash[i].lock, tflags);
>> +
>> + tip = thash[i].ch.next;
>> +
>> + /*
>> + ** free all structs chained in one bucket
>> + */
>> + while (tip != (void *)&thash[i].ch) {
>> + tipsave = tip->ch.next;
>> + delete_taskinfo(tip);
>> + tip = tipsave;
>> + }
>> +
>> + spin_unlock_irqrestore(&thash[i].lock, tflags);
>> + }
>> +}
>> +
>> +/*
>> +** remove all taskinfo structs from exit list
>> +*/
>> +static void
>> +wipetaskexit()
>> +{
>> + gctaskexit();
>> +}
>> +
>> +/*
>> +** move one taskinfo struct from hash bucket to exitlist
>> +*/
>> +static void
>> +move_taskinfo(struct taskinfo *tip)
>> +{
>> + unsigned long flags;
>> +
>> + /*
>> + ** remove from hash list
>> + */
>> + ((struct taskinfo *)tip->ch.next)->ch.prev = tip->ch.prev;
>> + ((struct taskinfo *)tip->ch.prev)->ch.next = tip->ch.next;
>> +
>> + spin_lock_irqsave(&nrtlock, flags);
>> + nrt--;
>> + spin_unlock_irqrestore(&nrtlock, flags);
>> +
>> + /*
>> + ** add to exitlist
>> + */
>> + tip->ch.next = NULL;
>> + tip->state = FINISHED;
>> + tip->exittime = jiffies_64;
>> +
>> + spin_lock_irqsave(&exitlock, flags);
>> +
>> + if (exittail) { // list filled?
>> + exittail->ch.next = tip;
>> + exittail = tip;
>> + } else { // list empty
>> + exithead = exittail = tip;
>> + }
>> +
>> + nre++;
>> +
>> + wake_up_interruptible(&exitlist_filled);
>> +
>> + spin_unlock_irqrestore(&exitlock, flags);
>> +}
>> +
>> +/*
>> +** remove one taskinfo struct from the hash bucket chain
>> +*/
>> +static void
>> +delete_taskinfo(struct taskinfo *tip)
>> +{
>> + unsigned long flags;
>> +
>> + ((struct taskinfo *)tip->ch.next)->ch.prev = tip->ch.prev;
>> + ((struct taskinfo *)tip->ch.prev)->ch.next = tip->ch.next;
>> +
>> + kmem_cache_free(ticache, tip);
>> +
>> + spin_lock_irqsave(&nrtlock, flags);
>> + nrt--;
>> + spin_unlock_irqrestore(&nrtlock, flags);
>> +}
>> +
>> +/*
>> +** remove one sockinfo struct from the hash bucket chain
>> +*/
>> +static void
>> +delete_sockinfo(struct sockinfo *sip)
>> +{
>> + unsigned long flags;
>> +
>> + ((struct sockinfo *)sip->ch.next)->ch.prev = sip->ch.prev;
>> + ((struct sockinfo *)sip->ch.prev)->ch.next = sip->ch.next;
>> +
>> + kmem_cache_free(sicache, sip);
>> +
>> + spin_lock_irqsave(&nrslock, flags);
>> + nrs--;
>> + spin_unlock_irqrestore(&nrslock, flags);
>> +}
>> +
>> +/*
>> +** read function for /proc/netatop
>> +*/
>> +static int
>> +netatop_show(struct seq_file *m, void *v)
>> +{
>> + seq_printf(m, "tcpsndpacks: %12lu (unident: %9lu)\n"
>> + "tcprcvpacks: %12lu (unident: %9lu)\n"
>> + "udpsndpacks: %12lu (unident: %9lu)\n"
>> + "udprcvpacks: %12lu (unident: %9lu)\n\n"
>> + "icmpsndpacks: %12lu\n"
>> + "icmprcvpacks: %12lu\n\n"
>> + "#sockinfo: %12lu (overflow: %8lu)\n"
>> + "#taskinfo: %12lu (overflow: %8lu)\n"
>> + "#taskexit: %12lu\n\n"
>> + "modversion: %14s\n",
>> + tcpsndpacks, unidenttcpsndpacks,
>> + tcprcvpacks, unidenttcprcvpacks,
>> + udpsndpacks, unidentudpsndpacks,
>> + udprcvpacks, unidentudprcvpacks,
>> + icmpsndpacks, icmprcvpacks,
>> + nrs, nrs_ovf,
>> + nrt, nrt_ovf,
>> + nre, NETATOPVERSION);
>> + return 0;
>> +}
>> +
>> +static int
>> +netatop_open(struct inode *inode, struct file *file)
>> +{
>> + return single_open(file, netatop_show, NULL);
>> +}
>> +
>> +/*
>> +** called when user space issues the getsockopt() system call
>> +*/
>> +static int
>> +getsockopt(struct sock *sk, int cmd, void __user *user, int *len)
>> +{
>> + int bt;
>> + struct taskinfo *tip;
>> + char tasktype = 't';
>> + struct netpertask npt;
>> + unsigned long tflags;
>> +
>> + /*
>> + ** verify the proper privileges
>> + */
>> + if (!capable(CAP_NET_ADMIN))
>> + return -EPERM;
>> +
>> + /*
>> + ** react on command
>> + */
>> + switch (cmd) {
>> + case NETATOP_PROBE:
>> + break;
>> +
>> + case NETATOP_FORCE_GC:
>> + garbage_collector();
>> + break;
>> +
>> + case NETATOP_EMPTY_EXIT:
>> + while (nre > 0) {
>> + if (wait_event_interruptible(exitlist_empty, nre == 0))
>> + return -ERESTARTSYS;
>> + }
>> + break;
>> +
>> + case NETATOP_GETCNT_EXIT:
>> + if (nre == 0)
>> + wake_up_interruptible(&exitlist_empty);
>> +
>> + if (*len < sizeof(pid_t))
>> + return -EINVAL;
>> +
>> + if (*len > sizeof npt)
>> + *len = sizeof npt;
>> +
>> + spin_lock_irqsave(&exitlock, tflags);
>> +
>> + /*
>> + ** check if an exited process is present
>> + ** if not, wait for it...
>> + */
>> + while (nre == 0) {
>> + spin_unlock_irqrestore(&exitlock, tflags);
>> +
>> + if ( wait_event_interruptible(exitlist_filled, nre > 0))
>> + return -ERESTARTSYS;
>> +
>> + spin_lock_irqsave(&exitlock, tflags);
>> + }
>> +
>> + /*
>> + ** get first process from exitlist and remove it from there
>> + */
>> + tip = exithead;
>> +
>> + if ( (exithead = tip->ch.next) == NULL)
>> + exittail = NULL;
>> +
>> + nre--;
>> +
>> + spin_unlock_irqrestore(&exitlock, tflags);
>> +
>> + /*
>> + ** pass relevant info to user mode
>> + ** and free taskinfo struct
>> + */
>> + npt.id = tip->id;
>> + npt.tc = tip->tc;
>> + npt.btime = tip->btime;
>> + memcpy(npt.command, tip->command, COMLEN);
>> +
>> + if (copy_to_user(user, &npt, *len) != 0)
>> + return -EFAULT;
>> +
>> + kmem_cache_free(ticache, tip);
>> +
>> + return 0;
>> +
>> + case NETATOP_GETCNT_TGID:
>> + tasktype = 'g';
>> + /* fall through */
>> +
>> + case NETATOP_GETCNT_PID:
>> + if (*len < sizeof(pid_t))
>> + return -EINVAL;
>> +
>> + if (*len > sizeof npt)
>> + *len = sizeof npt;
>> +
>> + if (copy_from_user(&npt, user, *len) != 0)
>> + return -EFAULT;
>> +
>> + /*
>> + ** search requested id in taskinfo hash
>> + */
>> + bt = THASH(npt.id, tasktype); // calculate hash
>> +
>> + if (thash[bt].ch.next == (void *)&thash[bt].ch)
>> + return -ESRCH; // quick return without lock
>> +
>> + spin_lock_irqsave(&thash[bt].lock, tflags);
>> +
>> + tip = thash[bt].ch.next;
>> +
>> + while (tip != (void *)&thash[bt].ch) {
>> + // is this the one?
>> + if (tip->id == npt.id && tip->type == tasktype) {
>> + /*
>> + ** found: copy results to user space
>> + */
>> + memcpy(npt.command, tip->command, COMLEN);
>> + npt.tc = tip->tc;
>> + npt.btime = tip->btime;
>> +
>> + spin_unlock_irqrestore(&thash[bt].lock, tflags);
>> +
>> + if (copy_to_user(user, &npt, *len) != 0)
>> + return -EFAULT;
>> + else
>> + return 0;
>> + }
>> +
>> + tip = tip->ch.next;
>> + }
>> +
>> + spin_unlock_irqrestore(&thash[bt].lock, tflags);
>> + return -ESRCH;
>> +
>> + default:
>> + printk(KERN_INFO "unknown getsockopt command %d\n", cmd);
>> + return -EINVAL;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/*
>> +** kernel mode thread: initiate garbage collection every N seconds
>> +*/
>> +static int netatop_thread(void *dummy)
>> +{
>> + while (!kthread_should_stop()) {
>> + /*
>> + ** do garbage collection
>> + */
>> + garbage_collector();
>> +
>> + /*
>> + ** wait a while
>> + */
>> + (void) schedule_timeout_interruptible(GCINTERVAL);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 13, 0)
>> +static int __net_init netatop_nf_register(struct net *net)
>> +{
>> + int err;
>> +
>> + err = nf_register_net_hooks(net, &hookin_ipv4, 1);
>> + if (err)
>> + return err;
>> +
>> + err = nf_register_net_hooks(net, &hookout_ipv4, 1);
>> + if (err)
>> + nf_unregister_net_hooks(net, &hookin_ipv4, 1);
>> +
>> + return err;
>> +}
>> +
>> +static void __net_exit netatop_nf_unregister(struct net *net)
>> +{
>> + nf_unregister_net_hooks(net, &hookin_ipv4, 1);
>> + nf_unregister_net_hooks(net, &hookout_ipv4, 1);
>> +}
>> +
>> +static struct pernet_operations netatop_net_ops = {
>> + .init = netatop_nf_register,
>> + .exit = netatop_nf_unregister,
>> +};
>> +#endif
>> +
>> +/*
>> +** called when module loaded
>> +*/
>> +int
>> +init_module()
>> +{
>> + int i;
>> +
>> + /*
>> + ** initialize caches for taskinfo and sockinfo
>> + */
>> + ticache = kmem_cache_create("Netatop_taskinfo",
>> + sizeof (struct taskinfo), 0, 0, NULL);
>> + if (!ticache)
>> + return -ENOMEM;
>> +
>> + sicache = kmem_cache_create("Netatop_sockinfo",
>> + sizeof (struct sockinfo), 0, 0, NULL);
>> + if (!sicache) {
>> + kmem_cache_destroy(ticache);
>> + return -ENOMEM;
>> + }
>> +
>> + /*
>> + ** initialize hash table for taskinfo and sockinfo
>> + */
>> + for (i=0; i < TBUCKS; i++) {
>> + thash[i].ch.next = &thash[i].ch;
>> + thash[i].ch.prev = &thash[i].ch;
>> + spin_lock_init(&thash[i].lock);
>> + }
>> +
>> + for (i=0; i < SBUCKS; i++) {
>> + shash[i].ch.next = &shash[i].ch;
>> + shash[i].ch.prev = &shash[i].ch;
>> + spin_lock_init(&shash[i].lock);
>> + }
>> +
>> + getboottime(&boottime);
>> +
>> + /*
>> + ** register getsockopt for user space communication
>> + */
>> + i = nf_register_sockopt(&sockopts);
>> + if (i < 0) {
>> + kmem_cache_destroy(ticache);
>> + kmem_cache_destroy(sicache);
>> + return i;
>> + }
>> +
>> + /*
>> + ** create a new kernel mode thread for time-driven garbage collection
>> + ** after creation, the thread waits until it is woken up
>> + */
>> + knetatop_task = kthread_create(netatop_thread, NULL, "knetatop");
>> +
>> + if (IS_ERR(knetatop_task)) {
>> + nf_unregister_sockopt(&sockopts);
>> + kmem_cache_destroy(ticache);
>> + kmem_cache_destroy(sicache);
>> + return PTR_ERR(knetatop_task);
>> + }
>> +
>> + /*
>> + ** prepare hooks for input and output packets
>> + */
>> + hookin_ipv4.hooknum = NF_IP_LOCAL_IN; // input packs
>> + hookin_ipv4.hook = ipv4_hookin; // func to call
>> + hookin_ipv4.pf = PF_INET; // IPV4 packets
>> + hookin_ipv4.priority = NF_IP_PRI_FIRST; // highest prio
>> +
>> + hookout_ipv4.hooknum = NF_IP_LOCAL_OUT; // output packs
>> + hookout_ipv4.hook = ipv4_hookout; // func to call
>> + hookout_ipv4.pf = PF_INET; // IPV4 packets
>> + hookout_ipv4.priority = NF_IP_PRI_FIRST; // highest prio
>> +
>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 13, 0)
>> + i = register_pernet_subsys(&netatop_net_ops);
>> + if (i) {
>> + kthread_stop(knetatop_task);
>> + nf_unregister_sockopt(&sockopts);
>> + kmem_cache_destroy(ticache);
>> + kmem_cache_destroy(sicache);
>> + return i;
>> + }
>> +#else
>> + nf_register_hook(&hookin_ipv4); // register hook
>> + nf_register_hook(&hookout_ipv4); // register hook
>> +#endif
>> +
>> + /*
>> + ** create a /proc-entry to produce status-info on request
>> + */
>> + proc_create("netatop", 0444, NULL, &netatop_proc_fops);
>> +
>> + /*
>> + ** all administration prepared; kick off kernel mode thread
>> + */
>> + wake_up_process(knetatop_task);
>> +
>> + return 0; // return success
>> +}
>> +
>> +/*
>> +** called when module unloaded
>> +*/
>> +void
>> +cleanup_module()
>> +{
>> + /*
>> + ** tell kernel daemon to stop
>> + */
>> + kthread_stop(knetatop_task);
>> +
>> + /*
>> + ** unregister netfilter hooks and other miscellaneous stuff
>> + */
>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 13, 0)
>> + unregister_pernet_subsys(&netatop_net_ops);
>> +#else
>> + nf_unregister_hook(&hookin_ipv4);
>> + nf_unregister_hook(&hookout_ipv4);
>> +#endif
>> +
>> + remove_proc_entry("netatop", NULL);
>> +
>> + nf_unregister_sockopt(&sockopts);
>> +
>> + /*
>> + ** destroy allocated stats
>> + */
>> + wipesockinfo();
>> + wipetaskinfo();
>> + wipetaskexit();
>> +
>> + /*
>> + ** destroy caches
>> + */
>> + kmem_cache_destroy(ticache);
>> + kmem_cache_destroy(sicache);
>> +}
>> +
>> +module_init(init_module);
>> +module_exit(cleanup_module);
>> diff --git a/drivers/staging/netatop/netatop.h
>> b/drivers/staging/netatop/netatop.h
>> new file mode 100644
>> index 000000000000..a14e6fa5db52
>> --- /dev/null
>> +++ b/drivers/staging/netatop/netatop.h
>> @@ -0,0 +1,47 @@
>> +#define COMLEN 16
>> +
>> +struct taskcount {
>> + unsigned long long tcpsndpacks;
>> + unsigned long long tcpsndbytes;
>> + unsigned long long tcprcvpacks;
>> + unsigned long long tcprcvbytes;
>> +
>> + unsigned long long udpsndpacks;
>> + unsigned long long udpsndbytes;
>> + unsigned long long udprcvpacks;
>> + unsigned long long udprcvbytes;
>> +
>> + /* space for future extensions */
>> +};
>> +
>> +struct netpertask {
>> + pid_t id; // tgid or tid (depending on command)
>> + unsigned long btime;
>> + char command[COMLEN];
>> +
>> + struct taskcount tc;
>> +};
>> +
>> +
>> +/*
>> +** getsocktop commands
>> +*/
>> +#define NETATOP_BASE_CTL 15661
>> +
>> +// just probe if the netatop module is active
>> +#define NETATOP_PROBE (NETATOP_BASE_CTL)
>> +
>> +// force garbage collection to make finished processes available
>> +#define NETATOP_FORCE_GC (NETATOP_BASE_CTL+1)
>> +
>> +// wait until all finished processes are read (blocks until done)
>> +#define NETATOP_EMPTY_EXIT (NETATOP_BASE_CTL+2)
>> +
>> +// get info for finished process (blocks until available)
>> +#define NETATOP_GETCNT_EXIT (NETATOP_BASE_CTL+3)
>> +
>> +// get counters for thread group (i.e. process): input is 'id' (pid)
>> +#define NETATOP_GETCNT_TGID (NETATOP_BASE_CTL+4)
>> +
>> +// get counters for thread: input is 'id' (tid)
>> +#define NETATOP_GETCNT_PID (NETATOP_BASE_CTL+5)
>> diff --git a/drivers/staging/netatop/netatopversion.h
>> b/drivers/staging/netatop/netatopversion.h
>> new file mode 100644
>> index 000000000000..eb011f5a13c9
>> --- /dev/null
>> +++ b/drivers/staging/netatop/netatopversion.h
>> @@ -0,0 +1,2 @@
>> +#define NETATOPVERSION "2.0"
>> +#define NETATOPDATE "2017/09/30 12:06:19"
>>
>