본문 바로가기
HOME> 논문 > 논문 검색상세

학위논문 상세정보

Parallel Disk-Based Processing of Massive Graphs on a Single Consumer Machine 원문보기

  • 저자

    KamranNajeebullah

  • 학위수여기관

    Kyung Hee University, Department of Computer Engineering

  • 학위구분

    국내석사

  • 학과

    전자계산공학과

  • 지도교수

  • 발행년도

    2014

  • 총페이지

  • 키워드

    Graph Anlytics Parallel Processing Disk-based Processing;

  • 언어

    eng

  • 원문 URL

    http://www.riss.kr/link?id=T13536648&outLink=K  

  • 초록

    Mining very large graphs has gained great importance due to its wide range of diverse applications. A number of systems, mostly distributed, have been proposed for scale up computations of graphs. In recent past, the focus of research in this area has been on development of a graph processing system that processes the graph on a single consumer machine. This approach is both cost effective and reduces the effort to implement graph algorithms. However, such solutions utilizes external memory and an efficient solution requires a good thought on part of storage and data structures. Moreover, as graphs are processed in chunks such solutions also require an efficient message passing interface that propagates the results of computation from one chunk to another in order to tackle the poor locality problem of graph data. In this thesis, we present two different solutions BiShard Parallel Processor and PostMaster for parallel disk-based processing of graph. Both of the solutions process massive graphs on a single consumer machine solving different issues in the existing systems. BiShard Parallel Processor(BSPP) presents a new storage structure BiShard that significantly reduces the number of non-sequential I/Os. It implement a new processing model named BiShard Parallel (BSP) on top of Bishard storage structure that enables full CPU parallelism for processing the graph.We demonstrate with extensive experiments over real large graphs that our solution show significant performance gains over its prior counterparts. PostMaster introduces a novel Inter-Subgraph Message Passing Interface (ISMPI) that uses an Intra-Subgraph Messages Index(ISMI) to enables sharing of messages among receivers and provide direct access to each messages withour searching. Moreover, PostMaster presents an efficient storage structure that is useful in three ways. 1) it reduces the random disk access by packing everything together. 2) Stores data in large chunks that ensures I/O parallelism on SSD. 3) Requires almost no effort to map to the in memory data structure. We demonstrate through extensive experiments on large real world graph that our solution significantly and consistently outperforms state-of-the-art by more than six times.


 활용도 분석

  • 상세보기

    amChart 영역
  • 원문보기

    amChart 영역