Ghulam Jilani Quadri has been selected as a 2021 CIFellow. Once he defends his PhD, Ghulam will join Danielle Szafir at The University of North Carolina at Chapel Hill.
Congratulations to Hamza Elhamdadi, who successfully defended his MS thesis, titled “AffectiveTDA: Using Topological Data Analysis to Improve Analysis and Explainability in Affective Computing”.
This fall, Hamza will be starting his Ph.D. at UMass Amherst.
The PageRank of a graph is a scalar function defined on the node set of the graph that encodes node centrality information. In this article we use the PageRank function along with persistent homology to obtain a scalable graph descriptor and use it to compare graphs for similarity. For a given graph G(V,E), our descriptor can be computed in O(|E|α(|V|)) time, where α is the inverse Ackermann function, which makes it scalable and computable on massive graphs. We show the effectiveness of our method by applying it to multiple shape mesh datasets.
Fast And Scalable Complex Network Descriptor Using Pagerank And Persistent Homology
Mustafa Hajij, Paul Rosen, and Elizabeth Munch
International Conference on Intelligent Data Science Technologies and Applications (IDSTA), 2020
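To make the idea in the abstract above concrete, here is a minimal sketch, not the authors' implementation. It assumes an undirected graph stored as an adjacency dictionary, computes PageRank by plain power iteration, and then computes 0-dimensional sublevel-set persistence of that function with a union-find structure. The function names (`pagerank`, `zero_dim_persistence`, `pagerank_persistence`), the damping factor, and the choice of a sublevel-set filtration are all assumptions made for illustration. This sketch also sorts the nodes and iterates PageRank, so it does not by itself attain the O(|E|α(|V|)) bound stated in the abstract.

```python
import math


def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on a graph given as {node: [neighbors]}."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            if nbrs:
                share = damping * rank[v] / len(nbrs)
                for u in nbrs:
                    new[u] += share
            else:
                # Dangling node: spread its mass uniformly over all nodes.
                for u in adj:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank


def zero_dim_persistence(adj, f):
    """Sorted (birth, death) pairs of connected components in the sublevel sets of f."""
    parent, size, birth = {}, {}, {}

    def find(v):
        root = v
        while parent[root] != root:
            root = parent[root]
        while parent[v] != root:  # path compression
            parent[v], v = root, parent[v]
        return root

    pairs = []
    # Add nodes in order of increasing function value.
    for v in sorted(adj, key=lambda x: (f[x], str(x))):
        parent[v], size[v], birth[v] = v, 1, f[v]
        for u in adj[v]:
            if u not in parent:
                continue  # neighbor has not entered the sublevel set yet
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            # Elder rule: the component born later dies at f[v].
            younger = max(birth[ru], birth[rv])
            if younger < f[v]:  # skip zero-persistence pairs
                pairs.append((younger, f[v]))
            if size[ru] < size[rv]:  # union by size
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            birth[ru] = min(birth[ru], birth[rv])
    # Components that never merge live forever.
    for root in {find(v) for v in adj}:
        pairs.append((birth[root], math.inf))
    return sorted(pairs)


def pagerank_persistence(adj):
    """Hypothetical descriptor: persistence pairs of the PageRank function."""
    return zero_dim_persistence(adj, pagerank(adj))
```

Two graphs could then be compared by applying any standard distance between persistence diagrams to the outputs of `pagerank_persistence`; that comparison step is not shown here.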
The construction of Mapper has emerged in the last decade as a powerful and effective topological data analysis tool that approximates and generalizes other topological summaries, such as the Reeb graph, the contour tree, and split and join trees. In this paper we study the parallel analysis of the construction of Mapper. We give a provably correct parallel algorithm to execute Mapper on multiple processors. Our algorithm relies on a divide-and-conquer strategy for the codomain cover, which gets pulled back to the domain cover. We demonstrate our approach for topological Mapper, then show how it can be applied to the statistical version of Mapper. Furthermore, we discuss performance results that compare our approach to a reference sequential Mapper implementation. Finally, we report performance experiments that demonstrate the efficiency of our method. To the best of our knowledge, this is the first algorithm that addresses the computation of Mapper in parallel.
Mustafa Hajij, Basem Assiri, and Paul Rosen
Proceedings of the Future Technologies Conference, 2020.
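The following is a minimal sketch of a statistical Mapper with a one-dimensional filter, written to illustrate why the construction in the abstract above lends itself to parallel execution: once the codomain is covered by overlapping intervals, each interval's preimage can be clustered independently. It is not the algorithm from the paper. The names `interval_cover`, `clusters_in_interval`, and `mapper`, the single-linkage clustering with threshold `eps`, and the use of a thread pool are all assumptions chosen for brevity.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; neighbors overlap by a fraction of their length."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1.0 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    cover[-1] = (cover[-1][0], hi)  # guard against floating-point shortfall
    return cover


def clusters_in_interval(points, values, interval, eps):
    """Single-linkage clusters (threshold eps) of the points whose filter value lies in the interval."""
    a, b = interval
    todo = {i for i, v in enumerate(values) if a <= v <= b}
    clusters = []
    while todo:
        seed = todo.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            near = {q for q in todo if math.dist(points[p], points[q]) <= eps}
            todo -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper(points, values, n_intervals=4, overlap=0.25, eps=1.0, workers=2):
    """Return (nodes, edges): nodes are clusters, edges join clusters that share a point."""
    cover = interval_cover(min(values), max(values), n_intervals, overlap)
    # Each interval is processed independently; results come back in cover order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_interval = list(
            pool.map(lambda iv: clusters_in_interval(points, values, iv, eps), cover)
        )
    nodes = [c for clusters in per_interval for c in clusters]
    edges = {
        (i, j)
        for i, j in combinations(range(len(nodes)), 2)
        if nodes[i] & nodes[j]
    }
    return nodes, edges
```

Threads are used here only to show the independence of the per-interval work; in CPython they will not speed up this pure-Python clustering, and a real parallel implementation would distribute the intervals across processes or machines.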
The Reeb graph of a scalar function that is defined on a domain gives a topologically meaningful summary of that domain. Reeb graphs have been shown in the past decade to be of great importance in geometric processing, image processing, computer graphics, and computational topology. The demand for analyzing large data sets has increased in the last decade. Hence, the parallelization of topological computations needs to be more fully considered. We propose a parallel augmented Reeb graph algorithm on triangulated meshes with and without a boundary. That is, in addition to our parallel algorithm for computing a Reeb graph, we describe a method for extracting the original manifold data from the Reeb graph structure. We demonstrate the running time of our algorithm on standard datasets. As an application, we show how our algorithm can be utilized in mesh segmentation algorithms.
An Efficient Data Retrieval Parallel Reeb Graph Algorithm
Mustafa Hajij and Paul Rosen
Algorithms: Special Issue on Topological Data Analysis, 2020
CAD course creative projects necessitate subjective feedback. In academia, peer review is a widely used instrument to gather diverse and timely feedback, which stimulates learning and engagement in students who review one another. To date, however, no effort to summarize and score subjective content from peer review text via sentiment analysis has been attempted in an educational setting, including CAD courses, many of which naturally employ a project-based architecture. This is perhaps due in part to a lack of specifically tuned tools. Towards meeting this need, we introduce a new lexicon compiled from actual peer review text, implemented specifically in a CAD-course context, and compare it to other publicly available lexicons. HeLPS, our domain-dependent lexicon, performed more concisely and accurately in our CAD courses and consistently tagged high-quality positive and negative sentiment with a lexicon a fraction of the size of others. Both qualitative and quantitative evidence suggest that HeLPS is the preferred option for identifying subjective opinion towards CAD course projects.
A Domain-Dependent Lexicon to Augment CAD Peer Review
Z. Beasley and L. Piegl
Computer-Aided Design and Applications, Vol 18, No 1, pp. 186-198, 2021
We present a comprehensive framework for evaluating line chart smoothing methods under a variety of visual analytics tasks. Line charts are commonly used to visualize a series of data samples. When the number of samples is large, or the data are noisy, smoothing can be applied to make the signal more apparent. However, there are a wide variety of smoothing techniques available, and the effectiveness of each depends upon both the nature of the data and the visual analytics task at hand. To date, the visualization community lacks a summary work for analyzing and classifying the various smoothing methods available. In this paper, we establish a framework based on 8 measures of line smoothing effectiveness tied to 8 low-level visual analytics tasks. We then analyze 12 methods coming from 4 commonly used classes of line chart smoothing: rank filters, convolutional filters, frequency domain filters, and subsampling. The results show that while no method is ideal for all situations, certain methods, such as Gaussian filters and topology-based subsampling, perform well in general. Other methods, such as low-pass cutoff filters and Douglas-Peucker subsampling, perform well for specific visual analytics tasks. Almost as importantly, our framework demonstrates that several methods, including the commonly used uniform subsampling, produce low-quality results and should therefore be avoided if possible.
LineSmooth: An Analytical Framework for Evaluating the Effectiveness of Smoothing Techniques on Line Charts
P. Rosen, G.J. Quadri
IEEE Transactions on Visualization and Computer Graphics (VAST 2020)
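As a small illustration of two of the method classes named in the abstract above, the sketch below implements a Gaussian convolutional filter and uniform subsampling in plain Python. It is not taken from the LineSmooth framework; the function names, the 3-sigma kernel radius, and the edge handling (renormalizing the truncated kernel) are assumptions made for this example.

```python
import math


def gaussian_smooth(ys, sigma=1.0, radius=None):
    """Convolve ys with a Gaussian kernel; near the ends the truncated kernel is renormalized."""
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(ys)
    out = []
    for i in range(n):
        acc = total = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < n:
                acc += w * ys[j]
                total += w
        out.append(acc / total)
    return out


def uniform_subsample(ys, step):
    """Keep every step-th sample, starting from the first."""
    return ys[::step]
```

For a signal such as `[0, 0, 0, 0, 0, 10, 0, 0, 0, 0, 0, 0]`, `uniform_subsample(signal, 4)` keeps only indices 0, 4, and 8 and therefore drops the spike at index 5 entirely, while `gaussian_smooth(signal)` lowers the spike but keeps its position. This is one simple way to see how the choice of method can interact with a task such as finding extrema.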
Scatterplots are used for a variety of visual analytics tasks, including cluster identification, and the visual encodings used on a scatterplot play a deciding role in the level of visual separation of clusters. For visualization designers, optimizing the visual encodings is crucial to maximizing the clarity of data. This requires accurately modeling human perception of cluster separation, which remains challenging. We present a multi-stage user study focusing on 4 factors (distribution size of clusters, number of points, size of points, and opacity of points) that influence cluster identification in scatterplots. From these parameters, we have constructed 2 models, a distance-based model and a density-based model, using the merge tree data structure from Topological Data Analysis. Our analysis demonstrates that these factors play an important role in the number of clusters perceived, and it verifies that the distance-based and density-based models can reasonably estimate the number of clusters a user observes. Finally, we demonstrate how these models can be used to optimize visual encodings on real-world data.
Modeling the Influence of Visual Density on Cluster Perception in Scatterplots Using Topology
G.J. Quadri, P. Rosen
IEEE Transactions on Visualization and Computer Graphics (InfoVis 2020)
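The abstract above mentions a distance-based model built on a merge tree. The sketch below shows only the generic underlying idea, under stated assumptions: points merge into components as a distance threshold grows (single linkage), the merge distances are the heights of the merge tree's internal nodes, and the number of clusters at a given threshold follows from counting the merges that have not yet happened. It does not model point size, opacity, or any perceptual factor from the study, and the names `merge_heights` and `cluster_count` are hypothetical.

```python
import math
from itertools import combinations


def merge_heights(points):
    """Distances at which single-linkage components merge, in increasing order."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    heights = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            heights.append(d)
    return heights


def cluster_count(points, threshold):
    """Number of components when points closer than or equal to threshold are joined."""
    if not points:
        return 0
    return 1 + sum(1 for h in merge_heights(points) if h > threshold)
```

For example, with two tight groups such as `[(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]`, a threshold of 2 yields two clusters, while a very small threshold separates every point and a very large one joins them all.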
Congratulations to Tanmay ‘TJ’ Kotha, who successfully defended his MS thesis, titled “Establishing Topological Data Analysis: A Comparison of Visualization Techniques”.
TJ has already joined Amazon as a Software Engineer.
Congratulations to Zach Beasley, who successfully defended his dissertation, “Sentiment Analysis in Peer Review”, on May 29.