Wolfram Language

Analyze Relations between Email Senders

A sender graph of a mailbox consists of vertices, each denoting a single sender, and edges, each denoting that two senders participate in the same conversation thread (whether or not one directly replies to the other). When the edges are weighted by the number of threads in which both senders appear, the graph shows which senders tend to participate in the same conversations. This example uses a downloaded mailing list archive in MBOX format.

Assuming the path to the downloaded MBOX is stored in the variable file, first create the conversation graph as discussed in a previous example.
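As a minimal sketch of what such a conversation graph looks like, the following builds one from a small hypothetical list of messages instead of the imported mailbox. The keys "ID", "From", and "ReplyTo" and all addresses are invented for illustration and are not MBOX import elements.

```wolfram
(* Hypothetical toy messages standing in for the imported mailbox. *)
(* "ReplyTo" holds the ID of the parent message, or None for a thread starter. *)
messages = {
   <|"ID" -> "m1", "From" -> "alice@example.com", "ReplyTo" -> None|>,
   <|"ID" -> "m2", "From" -> "bob@example.com", "ReplyTo" -> "m1"|>,
   <|"ID" -> "m3", "From" -> "carol@example.com", "ReplyTo" -> "m2"|>,
   <|"ID" -> "m4", "From" -> "bob@example.com", "ReplyTo" -> None|>,
   <|"ID" -> "m5", "From" -> "dave@example.com", "ReplyTo" -> "m4"|>
   };

(* One vertex per message, one directed edge from each reply to its parent. *)
replies = Select[messages, #["ReplyTo"] =!= None &];
replyEdges = DirectedEdge[#["ID"], #["ReplyTo"]] & /@ replies;
conversationGraph = 
 Graph[Lookup[messages, "ID"], replyEdges, VertexLabels -> "Name"]
```

Passing the full list of IDs as the vertex list keeps messages that never received a reply in the graph as isolated vertices.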


Extract all senders from the "MBOX" to get the vertices of the sender graph.
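A sketch of this step on hypothetical data: given one sender address per message (the addresses below are invented, and the real list would come from the imported mailbox), the vertices are the distinct addresses.

```wolfram
(* Hypothetical sender addresses, one per message in the mailbox. *)
fromAddresses = {"alice@example.com", "bob@example.com", 
   "alice@example.com", "carol@example.com"};

(* Each distinct sender becomes one vertex of the sender graph. *)
senders = DeleteDuplicates[fromAddresses]
```

DeleteDuplicates keeps the first occurrence of each address, so the vertex order follows the order in which senders first appear.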

Separate message threads and convert each one to a list of message IDs.
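One plausible way to do this, assuming the conversation graph has message IDs as vertices and reply links as edges, is to treat each weakly connected component as a thread. The graph below is a hypothetical stand-in, and WeaklyConnectedComponents is used here as an assumption about the approach rather than a statement of the original input.

```wolfram
(* Hypothetical conversation graph: each edge points from a reply to its parent. *)
conversationGraph = Graph[{"m1", "m2", "m3", "m4", "m5"},
   {"m2" -> "m1", "m3" -> "m2", "m5" -> "m4"}];

(* Each weakly connected component is one thread, given as a list of message IDs. *)
threads = WeaklyConnectedComponents[conversationGraph]
```

Weak connectivity ignores edge direction, so a message and all of its direct and indirect replies end up in the same component.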

Convert each message ID to a sender address, deleting the duplicates in each thread.
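A minimal sketch of the conversion, assuming a lookup table from message ID to sender address. The name idToSender and all data are hypothetical.

```wolfram
(* Hypothetical threads, each a list of message IDs. *)
threads = {{"m1", "m2", "m3"}, {"m4", "m5"}};

(* Hypothetical lookup table from message ID to sender address. *)
idToSender = <|"m1" -> "alice@example.com", "m2" -> "bob@example.com",
   "m3" -> "alice@example.com", "m4" -> "bob@example.com",
   "m5" -> "dave@example.com"|>;

(* Replace IDs with senders, keeping each sender once per thread. *)
threadSenders = DeleteDuplicates[Lookup[idToSender, #]] & /@ threads
```

Removing duplicates within a thread means a sender who posts several times in one thread still contributes that thread only once to each edge count.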

Find sender pairs in each thread, returning them as a flat list of edges.
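The pairing can be sketched with Subsets, which lists every two-element subset of a thread's senders. The helper name pairsInThread and the data are hypothetical. Sorting each thread first is a choice made here so that the same pair always yields an identical edge.

```wolfram
(* Hypothetical per-thread sender lists, duplicates already removed. *)
threadSenders = {
   {"bob@example.com", "alice@example.com", "carol@example.com"},
   {"alice@example.com", "bob@example.com"},
   {"dave@example.com"}
   };

(* All unordered sender pairs of one thread, as undirected edges. *)
pairsInThread[s_List] := UndirectedEdge @@@ Subsets[Sort[s], {2}];

(* Flat list of edges; a pair appears once for every thread the two senders share. *)
edges = Flatten[pairsInThread /@ threadSenders]
```

A thread with a single sender contributes no edges, and the repeated edges in the flat list are what later provide the per-edge thread counts.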

Create the graph from the computed vertices and distinct edges.
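A sketch of the construction on hypothetical data. The edge list deliberately contains a repeated edge to show why the duplicates are removed before building the graph.

```wolfram
(* Hypothetical vertices and edges; the alice-bob edge occurs twice. *)
senders = {"alice", "bob", "carol", "dave"};
edges = {UndirectedEdge["alice", "bob"], UndirectedEdge["alice", "carol"],
   UndirectedEdge["bob", "carol"], UndirectedEdge["alice", "bob"]};

(* Distinct edges give a simple graph; senders without shared threads stay isolated. *)
senderGraph = Graph[senders, DeleteDuplicates[edges], VertexLabels -> "Name"]
```

Without DeleteDuplicates, the repeated edge would produce a multigraph with parallel edges instead of a single edge per sender pair.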

Use VertexDegree to define a function that labels each vertex, making the size increase with the number of other senders who participate in threads with that sender.
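One way such a labeling function might look is sketched below. The helper degreeLabel, the choice of scaling the label's font size, and the constants 8 and 3 are all assumptions made for illustration, and the graph is a hypothetical stand-in.

```wolfram
(* Hypothetical sender graph. *)
senderGraph = Graph[{UndirectedEdge["alice", "bob"], 
    UndirectedEdge["alice", "carol"], UndirectedEdge["bob", "carol"], 
    UndirectedEdge["bob", "dave"]}];

(* Hypothetical helper: label vertex v with its name, using a font size *)
(* that grows with the degree of v. The constants 8 and 3 are arbitrary. *)
degreeLabel[g_Graph][v_] := 
  v -> Style[v, FontSize -> 8 + 3 VertexDegree[g, v]];

labels = degreeLabel[senderGraph] /@ VertexList[senderGraph];
Graph[senderGraph, VertexLabels -> labels]
```

In a sender graph with distinct edges, the degree of a vertex equals the number of other senders who share at least one thread with that sender.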


Make the thickness of an edge proportional to the number of threads in which it appears.
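A sketch of the weighting, assuming the flat edge list still contains one copy of an edge per shared thread. The data and the scale factor 0.004 are hypothetical.

```wolfram
(* Hypothetical flat edge list; alice and bob share two threads. *)
edges = {UndirectedEdge["alice", "bob"], UndirectedEdge["alice", "carol"],
   UndirectedEdge["alice", "bob"], UndirectedEdge["bob", "dave"]};

(* Number of threads in which each sender pair appears. *)
edgeCounts = Counts[edges];

(* One style rule per distinct edge; thickness is proportional to the count. *)
edgeStyles = KeyValueMap[#1 -> Thickness[0.004 #2] &, edgeCounts]
```

The resulting list of rules has the form expected by the EdgeStyle option of Graph.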

Add these properties to the previous graph to get the complete sender graph.
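Putting the pieces together, a minimal end-to-end sketch on hypothetical data might look as follows. All names, addresses, and styling constants are assumptions for illustration.

```wolfram
(* Hypothetical flat edge list with one copy of an edge per shared thread. *)
edges = {UndirectedEdge["alice", "bob"], UndirectedEdge["alice", "carol"],
   UndirectedEdge["alice", "bob"], UndirectedEdge["bob", "dave"]};

(* Plain sender graph from the distinct edges. *)
senderGraph = Graph[DeleteDuplicates[edges]];

(* Labels whose font size grows with the vertex degree. *)
labels = (# -> Style[#, FontSize -> 8 + 3 VertexDegree[senderGraph, #]]) & /@ 
   VertexList[senderGraph];

(* Edge thickness proportional to the number of shared threads. *)
styles = KeyValueMap[#1 -> Thickness[0.004 #2] &, Counts[edges]];

(* Graph[g, options] returns a copy of g with the given options applied. *)
fullSenderGraph = Graph[senderGraph, VertexLabels -> labels, EdgeStyle -> styles]
```

Applying the options to the existing graph leaves its vertices and edges unchanged and only alters how they are drawn.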
