Importantly for Clique, the research programme is driven by a set of challenges identified by our three industrial partners, who provide access to large volumes of data and to information about the characteristics of real data. On the biological side, our collaborators in the UCD Conway Institute and in the Krogan Lab at the University of California, San Francisco provide similar access and expertise. In the coming years, research in network data analysis will be transformed by access to large-scale dynamic data resources.
Our key research challenges can be summarised as follows:
Clique represents a research programme in network analysis that addresses issues in internet services, fraud detection and bioinformatics. Our specific work package details include:
WP 1: Dynamic Network Analysis
The fundamental observation that drives social network analysis, and that connects all the research challenges, is that relationships between data entities are critical to understanding the interesting characteristics and features of the data. Network or graph representations of the data, and graph-theoretic tools to analyse these representations, are therefore central to this research programme. This work package focuses on the development of fundamental network analysis tools, with particular attention to the dynamic nature of these networks.
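As a purely illustrative sketch of what "dynamic" can mean here, a changing network can be treated as a sequence of edge-list snapshots, with a simple graph measure (node degree) tracked per snapshot and the added and removed edges computed between consecutive snapshots. The snapshot data and the helper names `degree_by_node` and `edge_changes` are assumptions made for this example; they are not tools or data from the Clique programme.

```python
from collections import defaultdict


def degree_by_node(edges):
    """Count how many undirected edges touch each node in one snapshot."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def edge_changes(previous, current):
    """Return (added, removed) undirected edges between two snapshots."""
    prev = {frozenset(edge) for edge in previous}
    curr = {frozenset(edge) for edge in current}
    return curr - prev, prev - curr


# Hypothetical toy data: three snapshots of a small undirected network.
snapshots = [
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("b", "c"), ("c", "d")],
    [("b", "c"), ("c", "d"), ("a", "d")],
]

# One degree table per snapshot, and one (added, removed) pair per transition.
degrees_over_time = [degree_by_node(s) for s in snapshots]
changes_over_time = [
    edge_changes(prev, curr) for prev, curr in zip(snapshots, snapshots[1:])
]
```

Storing edges as `frozenset` pairs makes the comparison direction-independent, which suits undirected networks; a directed network would keep ordered tuples instead.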
WP 2: Probabilistic Network Models and Applications
An alternative to the matrix decomposition approach to network analysis addressed in WP1.1 is the family of probabilistic techniques coming from statistics. It is easier to handle complex forms of data in the probabilistic paradigm, but scaling these techniques to work with large networks is a challenge. This work package will focus on the development of network models, methods for network model inference and applications.
WP 3: Visualisation
The interactive visual representation of abstract data, to aid human exploration and understanding, is a key research challenge in bioinformatics and social network analysis. Network visualisation is concerned with the sourcing, management, layout, drawing and viewing of relational data, and with interaction with it. Visualisation relies on a human to guide the application of methods, the structuring of queries and the control of the interaction in the pursuit of understanding. The focus of this work package is on the development of the fundamental algorithms, methods and interaction techniques required to visualise large and dynamic networks possessing latent structure. Such structures include centrality, flow, communities or anomalous structure. Tasks in this work package are closely related to, and support, work undertaken in other work packages.
WP 4: Social Analytics
A number of fundamental questions have recently arisen about how the analysis of online social behaviour – social analytics – can be used to gauge the service functionality of the network and to predict its future efficacy. While social media networks are increasingly relied upon to deliver services such as customer relationship management, customer and technical support, dissemination of corporate knowledge and information updates, the human-centred network on which these services rely is poorly understood. We may draw a useful analogy to the instrumentation, analysis and diagnostic procedures of ‘packet’ networks, where the fault tolerances for different types of Internet traffic and behaviour are understood and designed for. In contrast, we have only sketchy ideas of what constitutes the limits of normative ‘functional’ behaviour and of what aspects of the network should be measured to reveal it. What may be normative in one domain may be dysfunctional in another.
These observations suggest that there is fundamental work required in methods for instrumentation, diagnosis and an intervention pipeline for social network operators. We do not believe that it is possible to build a complete diagnostic model of social networks. Rather the goal of this work package is to explain the possible root-cause relationships between commonly occurring symptoms in social media networks and fundamental aspects of the network such as its structure, role composition and rate of change over time.
WP 5: Discovering Anomalous Structure
Discovering remarkable or anomalous structure is a key research challenge in the analysis of financial data and in recommender and opinion-based systems. In the analysis of biological networks, the identification of unusual motifs that are the basic modules of information processing is of great interest. In social networks and in the social web, the discovery of network structure that is in some way false or fraudulent has also attracted a lot of research interest. This work package is closely integrated with the visualisation work package, and there is now an emphasis on identifying dynamic anomalous structure.
WP 6: Biological Network Analysis
Large genomic datasets have predictive power to identify biological classes, and the number and quality of such datasets are expected to increase dramatically as new technologies (e.g. deep sequencing, quantitative mass spectrometry) are applied in model organisms and clinical settings. For example, gene expression patterns have been used to predict cancer types or prognosis, while gene expression matrices obtained in different conditions have been used to classify genes. Matrices may be rectangular and asymmetric, e.g. genes by conditions or genes by promoter motifs, or they may be square and symmetric, such as protein physical interactions or gene functional pairwise interactions; the latter are particularly informative for network structure.
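To make the distinction between the two matrix shapes concrete, the following minimal sketch derives a square, symmetric gene-by-gene matrix from a rectangular genes-by-conditions matrix using Pearson correlation. The gene names, expression values and the choice of Pearson correlation are assumptions made for illustration only; the text above does not prescribe a particular similarity measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical rectangular matrix: 3 genes (rows) by 4 conditions (columns).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

genes = sorted(expression)

# Square, symmetric 3 x 3 matrix of pairwise gene similarities.
similarity = [
    [pearson(expression[g], expression[h]) for h in genes] for g in genes
]
```

Thresholding such a symmetric matrix is one common way to obtain a network whose nodes are genes and whose edges link genes with similar profiles.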