INTERNATIONAL JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY (IJCST)-VOL III ISSUE II, VER 4, APRIL TO JUNE, 2012


S.No. Research Topic Paper ID
   131 Label Based Clustering of Detecting Duplicate Videos in a Large Database
Nagulmeera Sayyed, B.Hari Babu

Abstract

The proposed system is an efficient method for detecting duplicate videos using labels created virtually on videos, taking video characteristics (frame rate, interlacing, aspect ratio, color space, bits per pixel, bitrate, and stereoscopy) as metrics called duplicate detection rules. The system creates a duplicate detection rule for a specific video type and derives five different labels for each video in the database; the labels are created according to the video characteristics and clustered according to the video type. The duplicate detection rules are imposed on the target video data set, and the duplicate videos for a specific source video are sorted by timestamp. We also build on earlier work on efficient and accurate duplicate video detection in a large database using video fingerprints. We have empirically chosen the color layout descriptor, a compact and robust frame-based descriptor, to create fingerprints, which are further encoded by Vector Quantization (VQ). We propose a new nonmetric distance measure to find the similarity between the query and a database video fingerprint and experimentally show its superior performance over other distance measures for accurate duplicate detection. Efficient search cannot be performed for high-dimensional data under a nonmetric distance measure with existing indexing techniques; therefore, we develop novel search algorithms based on precomputed distances and new dataset pruning techniques, yielding practical retrieval times. We perform experiments with a database of 38,000 videos, worth 1,600 h of content. For individual queries with an average duration of 60 s (about 50% of the average database video length), the duplicate video is retrieved in 0.032 s, on an Intel Xeon CPU at 2.33 GHz, with a very high accuracy of 97.5%.
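The fingerprint-and-encode step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy two-word codebook, the two-dimensional descriptors, and the position-wise agreement used as a stand-in for the paper's nonmetric distance are all assumptions.

```python
import numpy as np

def vq_encode(frame_descriptors, codebook):
    """Encode each frame descriptor (e.g., a color layout descriptor)
    as the index of its nearest codeword, yielding a compact fingerprint."""
    d = ((frame_descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def fingerprint_similarity(fp_query, fp_db):
    """Assumed stand-in for the paper's nonmetric measure: the fraction
    of aligned positions whose codeword indices agree."""
    n = min(len(fp_query), len(fp_db))
    return float((fp_query[:n] == fp_db[:n]).mean())

# hypothetical 2-codeword codebook and three frame descriptors
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
frames = np.array([[0.1, 0.0], [0.9, 1.1], [0.2, 0.1]])
fp = vq_encode(frames, codebook)
```

A query fingerprint is compared against every database fingerprint with `fingerprint_similarity`; the indexing and pruning techniques of the paper then avoid scanning the full database.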
Full Paper

IJCST/32/4/
A-740
   132 Cooperative Activities of Public Network Information with Edge Clustering
T.J. Sateesh Chandra, K.John Paul

Abstract

It is well known that actors in a network demonstrate correlated behaviors. In this work, we aim to predict the outcome of collective behavior given a social network and the behavioral information of some actors. In particular, we explore scalable learning of collective behavior when millions of actors are involved in the network. Our approach follows a social-dimension-based learning framework: social dimensions are extracted to represent the potential affiliations of actors before discriminative learning occurs. As existing approaches to extracting social dimensions suffer from scalability, it is imperative to address the scalability issue. We propose an edge-centric clustering scheme to extract social dimensions and a scalable k-means variant to handle edge clustering. Essentially, each edge is treated as one data instance, and the connected nodes are the corresponding features. The proposed k-means clustering algorithm can then be applied to partition the edges into disjoint sets, with each set representing one possible affiliation. With this edge-centric view, we show that the extracted social dimensions are guaranteed to be sparse. This model, based on the sparse social dimensions, shows prediction performance comparable to earlier social-dimension approaches. A distinct advantage of our model is that it easily scales to handle networks with millions of actors while the earlier models fail. This scalable approach offers a viable solution to effective learning of online collective behavior at a large scale.
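The edge-centric view can be sketched as below: each edge becomes a data instance whose features are its two endpoint nodes, and a plain k-means partitions the edges. This is a minimal dense sketch, not the paper's scalable sparse variant, and the toy graph is an assumption for illustration.

```python
import numpy as np

def edge_centric_kmeans(edges, n_nodes, k, iters=20, seed=0):
    """Cluster edges: each edge is one instance, represented as a
    2-hot vector over all nodes (its two endpoints)."""
    rng = np.random.default_rng(seed)
    X = np.zeros((len(edges), n_nodes))
    for i, (u, v) in enumerate(edges):
        X[i, u] = X[i, v] = 1.0
    # initialize centers from k distinct edges
    centers = X[rng.choice(len(edges), size=k, replace=False)]
    for _ in range(iters):
        # assign each edge to its nearest center
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centers[c] = members.mean(0)
    return labels

# toy graph: two triangles joined by a bridge edge
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = edge_centric_kmeans(edges, n_nodes=6, k=2)
```

Each resulting edge cluster corresponds to one candidate affiliation; a node's social dimensions are then the clusters its incident edges fall into, which is why the dimensions are sparse.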
Full Paper

IJCST/32/4/
A-741
   133 Performance of Human Expertise in Service-Oriented Systems
N.Veeranjaneyulu, G.Subba Lakshmi

Abstract

In this paper, we motivate the trend towards socio-technical systems in SOA. In such environments, social implications must be handled properly. With the human user in the loop, numerous concepts, including personalization, expertise involvement, drifting interests, and social dynamics, become of paramount importance. Therefore, we discuss related Web standards and show ways to extend them to fit the requirements of a people-centric Web. In particular, we outline concepts that let people offer their expertise in a service-oriented manner and cover the deployment, discovery, and selection of Human-Provided Services. In future work, we aim at providing more fine-grained monitoring and adaptation strategies. An example is the translation service presented in this paper, where some language options are typically used more often, or even more successfully, than others. In that case, data types could be modified to reduce the number of available language options in the WSDL interface description and to restrict input parameters. By harnessing delegation patterns that involve various participants, a complex social-network perspective is established in which connections are maintained not only between one client and an avatar, but also among avatars.
Full Paper

IJCST/32/4/
A-742
   134 Middle Layer Based Socially-Enhanced Virtual Communities
P.Tanuja Manjira Devi, K.Subhashini

Abstract

Interactions spanning multiple organizations have become an important aspect of today's collaboration landscape. Organizations create alliances to fulfill strategic objectives. The dynamic nature of collaborations increasingly demands automated techniques and algorithms to support the creation of such alliances. Our approach is based on recommending potential alliances by discovering currently relevant competence sources and supporting semi-automatic formation. The environment is service-oriented, comprising humans and software services with distinct capabilities. To mediate between previously separated groups and organizations, we introduce an advanced interaction metric, the interaction rate (qualifier), for bridging future interactions, which enhances the broker discovery process. We follow the broker concept that bridges disconnected networks and present a dynamic broker discovery approach based on interaction mining techniques and trust metrics. We evaluate our approach using simulations in real Web service testbeds.
Full Paper

IJCST/32/4/
A-743
   135 Performance of Asynchronous Sensor Networks
K. Ratna Kumari, B.Hari Babu

Abstract

In most sensor networks the nodes are static. Nevertheless, node connectivity is subject to changes because of disruptions in wireless communication, transmission power changes, or loss of synchronization between neighboring nodes. Hence, even after a sensor is aware of its immediate neighbors, it must continuously maintain its view, a process we call continuous neighbor discovery. In this work we distinguish between neighbor discovery during sensor network initialization and continuous neighbor discovery. We focus on the latter and view it as a joint task of all the nodes in every connected segment. Each sensor employs a simple protocol in a coordinated effort to reduce power consumption without increasing the time required to detect hidden sensors.
Full Paper

IJCST/32/4/
A-744
   136 Presentation of Transfer Through Association True and False Positive by using Watermarking
V. Chandra Kanth, G.Subba Lakshmi

Abstract

Tracing attackers’ traffic through stepping stones is a challenging problem, especially when the attack traffic is encrypted and its timing is manipulated (perturbed) to interfere with traffic analysis. The random timing perturbation by the adversary can greatly reduce the effectiveness of passive, timing-based correlation techniques. We present a novel active timing-based correlation approach to deal with random timing perturbations. By embedding a unique watermark into the inter-packet timing, with sufficient redundancy, we can make the correlation of encrypted flows substantially more robust against random timing perturbations. Our analysis and our experimental results confirm these assertions. Our watermark-based correlation is provably effective against correlated random timing perturbation as long as the covariance of the timing perturbations on different packets is fixed. Specifically, the proposed watermark-based correlation can, with arbitrarily small average time adjustment, achieve arbitrarily close to 100% watermark detection (correlation true positive) rate and arbitrarily close to 0% collision (correlation false positive) probability at the same time against arbitrarily large (but bounded) random timing perturbation of arbitrary distribution (or process), as long as there are enough packets in the flow to be watermarked.
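The redundant inter-packet-timing embedding described above can be sketched as follows. This is a minimal sketch under stated assumptions: the secret position permutation, the fixed delay adjustment `delta`, and the detector's access to the reference timing are simplifications — the real scheme correlates two observed flows rather than comparing against the original delays.

```python
import random

def embed_timing_watermark(delays, bits, redundancy=3, delta=0.02, seed=42):
    """Embed each watermark bit redundantly into inter-packet delays:
    bit 1 lengthens a chosen delay by delta, bit 0 shortens it."""
    out = list(delays)
    positions = list(range(len(out)))
    random.Random(seed).shuffle(positions)  # secret positions shared with the detector
    it = iter(positions)
    for bit in bits:
        for _ in range(redundancy):
            i = next(it)
            out[i] = max(0.0, out[i] + (delta if bit else -delta))
    return out, positions

def detect_timing_watermark(reference, marked, positions, n_bits, redundancy=3):
    """Recover each bit by majority vote over its redundant copies."""
    it = iter(positions)
    bits = []
    for _ in range(n_bits):
        votes = 0
        for _ in range(redundancy):
            i = next(it)
            votes += 1 if marked[i] > reference[i] else -1
        bits.append(1 if votes > 0 else 0)
    return bits
```

Redundancy is what buys robustness: a random perturbation must flip a majority of a bit's copies to corrupt that bit, which becomes arbitrarily unlikely as the number of watermarked packets grows.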
Full Paper

IJCST/32/4/
A-745
   137 Active Flow Fairness for Optimized High Speed Downlink Packet Access (HSDPA)
M.Raghu Vamsee, K.John Paul

Abstract

In this paper, we investigate throughput optimization in High Speed Downlink Packet Access (HSDPA). Specifically, we propose a Packet Rate Estimator (PRE) for HSDPA over the network rate, which enforces active flow fairness and optimizes HSDPA. In the proposed scheme, high-bandwidth flows are identified via a multi-level caching technique and used to calculate the PRE. We follow the earlier offline and online algorithms for adjusting the Channel Quality Indicator (CQI) used by the network to enhance scheduled data transmission. In the offline algorithm, a given target BLER is achieved by adjusting the CQI based on ACK/NAK history; by sweeping through different target BLERs, we can find the throughput-optimal BLER offline. This algorithm can be used not only to optimize throughput but also to enable fair resource allocation among mobile users in HSDPA. In the online algorithm, the CQI offset is adapted using an estimated short-term throughput gradient without specifying a target BLER, and an adaptive step-size mechanism is proposed to track temporal variation of the environment. We investigate the convergence behavior of both algorithms. Simulation results show that the proposed offline algorithm achieves the given target BLER with good accuracy. Both algorithms yield up to 30% HSDPA throughput improvement over that with a 10% target BLER.
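The offline idea of steering the CQI offset toward a target BLER from ACK/NAK history can be sketched as below; the step size, the fixed-step update rule, and the 0/1 encoding of NAK/ACK are assumptions for illustration, not the paper's exact algorithm.

```python
def adjust_cqi_offset(offset, ack_history, target_bler=0.1, step=0.25):
    """Offline-style CQI offset update from ACK/NAK history
    (1 = ACK, 0 = NAK): if the observed block error rate is below the
    target, report a higher CQI (more aggressive transmission);
    if at or above it, back off."""
    bler = ack_history.count(0) / len(ack_history)
    return offset + step if bler < target_bler else offset - step
```

Sweeping `target_bler` over a grid and measuring throughput at each setting then yields the throughput-optimal BLER, as the abstract describes.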
Full Paper

IJCST/32/4/
A-746
   138 Precision and Competence in Interruption Detection System
S.Aruna, B.Hari Babu

Abstract

Intrusion detection faces a number of challenges; an intrusion detection system must reliably detect malicious activities in a network and must perform efficiently to cope with the large amount of network traffic. In this paper, we address these two issues of accuracy and efficiency using Conditional Random Fields, a Layered Approach, and automatic predefined user prompts. We demonstrate that high attack detection accuracy can be achieved by using Conditional Random Fields, and high efficiency by implementing the Layered Approach and prompts. Experimental results on the benchmark KDD ’99 intrusion data set show that our proposed system based on Layered Conditional Random Fields outperforms other well-known methods such as decision trees and naive Bayes. The improvement in attack detection accuracy is very high, particularly for the U2R attacks (34.8 percent improvement) and the R2L attacks (34.5 percent improvement). Statistical tests also demonstrate higher confidence in detection accuracy for our method. Finally, we show that our system is robust and is able to handle noisy data without compromising performance.
Full Paper

IJCST/32/4/
A-747
   139 Performance Evaluation of Ranking-Based Techniques
B.Jayasree, CH. Raja Jacob

Abstract

In recent years, many techniques have been proposed to improve recommendation quality. In most cases, however, new techniques are designed to improve the accuracy of recommendations, whereas recommendation diversity has often been overlooked. In particular, we show that while ranking recommendations according to the predicted rating values (which is a de facto ranking standard in recommender systems) provides good predictive accuracy, it tends to perform poorly with respect to recommendation diversity. Therefore, in this paper, we propose a number of recommendation ranking techniques that can provide significant improvements in recommendation diversity with only a small amount of accuracy loss. In addition, these ranking techniques offer flexibility to system designers, since they are parameterizable and can be used in conjunction with different rating prediction algorithms (i.e., they do not require the designer to use only some specific algorithm). They are also based on scalable sorting-based heuristics and, thus, are extremely efficient. We provide a comprehensive empirical evaluation of the proposed techniques and obtain consistent and robust diversity improvements across multiple real-world datasets and using different rating prediction techniques. This work gives rise to several interesting directions for future research. In particular, additional item ranking criteria should be explored for potential diversity improvements; these may include consumer-oriented or manufacturer-oriented ranking mechanisms, depending on the given application domain, as well as external factors such as social networks. Also, as mentioned earlier, optimization-based approaches could be used to achieve further improvements in recommendation diversity, although these improvements may come with a (possibly significant) increase in computational complexity.
Moreover, because of the inherent tradeoff between the accuracy and diversity metrics, an interesting research direction would be to develop a new measure that captures both of these aspects in a single metric.
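One instance of such a parameterizable ranking technique can be sketched as follows: keep only items whose predicted rating clears a threshold (limiting the accuracy loss), then surface the least popular of them first. The dictionary fields `pred` and `popularity` and the 3.5 default threshold are illustrative assumptions, not the paper's exact formulation.

```python
def diversity_rerank(candidates, rating_threshold=3.5):
    """Popularity-based re-ranking: filter to items still predicted as
    relevant, then recommend the least popular (most novel) ones first
    instead of ranking purely by predicted rating."""
    good = [c for c in candidates if c["pred"] >= rating_threshold]
    return sorted(good, key=lambda c: c["popularity"])
```

Raising `rating_threshold` trades diversity back for accuracy, which is the knob the abstract refers to when it calls the techniques parameterizable.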
Full Paper

IJCST/32/4/
A-748
   140 Performance of Optimal Service Pricing for Correlations
P. Rajdeepak, K.John Paul

Abstract

This work proposes a novel pricing scheme designed for a cloud cache that offers querying services and aims at maximizing the cloud profit. We define an appropriate price-demand model and formulate the optimal pricing problem. The proposed solution allows, on one hand, long-term profit maximization and, on the other, dynamic calibration to the actual behavior of the cloud application while the optimization process is in progress. We discuss qualitative aspects of the solution and a variation of the problem that allows the consideration of user satisfaction together with profit maximization. The viability of the pricing solution is ensured by proposing a method that estimates the correlations of the cache services in a time-efficient manner.
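As a worked illustration of what "define a price-demand model and formulate the optimal pricing problem" means in the simplest case, consider a linear demand curve; the linear form and its parameters are assumptions for exposition, not the paper's actual model.

```python
def optimal_price(a, b, cost):
    """For a linear price-demand model d(p) = a - b*p, profit is
    (p - cost) * d(p), a downward parabola in p; setting its derivative
    a - 2*b*p + b*cost to zero gives the profit-maximizing price."""
    return (a + b * cost) / (2 * b)
```

The paper's setting replaces this closed form with dynamic calibration: the demand parameters are re-estimated from observed cloud-application behavior while optimization is in progress.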
Full Paper

IJCST/32/4/
A-749
   141 Concert Assessment Corresponding Algorithms for Firewalls
P.Sindhura, P.Nageswara Rao

Abstract

We have seen that the GEM algorithm is an efficient and practical algorithm for firewall packet matching. We implemented it successfully in the Linux kernel and tested its packet-matching speeds on live traffic with realistic large rule-bases. GEM’s matching speed is far better than the naive linear search, and it is able to increase the throughput of iptables by an order of magnitude. On rule-bases generated according to realistic statistics, GEM’s space complexity is well within the capabilities of modern hardware. Thus we believe that GEM may be a good candidate for use in firewall matching engines. We note that there are other algorithms that may well be candidates for software implementation in the kernel. We believe it would be quite interesting to implement all of these algorithms and to test them on an equal footing, using the same hardware, rule-bases, and traffic load. Furthermore, it would be interesting to perform this comparison with real rule-bases, in addition to synthetic Perimeter-model rules. We leave such a “bake-off” for future work. As for GEM itself, we would like to explore the algorithm’s behavior when using more than four fields, e.g., matching on the TCP flags, metadata, interfaces, etc. The main questions are: How best to encode the non-range fields? Will the space complexity still stay close to linear? What order of fields achieves the best space complexity? Another direction to pursue is how GEM would perform with IPv6, in which IP addresses have 128 bits.
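The geometric idea behind GEM can be sketched in one dimension: cut a field's value range into elementary intervals at rule endpoints, precompute the first-match decision for each interval, and answer lookups with a single binary search instead of a linear scan. This one-field sketch with made-up rules is an assumption for illustration; the real algorithm layers such decompositions over four fields.

```python
import bisect

def build_interval_index(rules):
    """rules: list of (lo, hi, action) in first-match order.
    Cut the range at rule endpoints and precompute, for each elementary
    interval, the action of the first rule covering it."""
    points = sorted({lo for lo, hi, _ in rules} | {hi + 1 for lo, hi, _ in rules})
    answers = []
    for p in points:
        # first-match semantics: take the earliest rule covering p
        answers.append(next((act for lo, hi, act in rules if lo <= p <= hi), "drop"))
    return points, answers

def match(points, answers, value):
    """Single binary search replaces the linear rule scan."""
    i = bisect.bisect_right(points, value) - 1
    return answers[i] if i >= 0 else "drop"
```

The precomputation is where the space cost lives, which is why the abstract's open questions are about keeping space near linear as fields are added.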
Full Paper

IJCST/32/4/
A-750
   142 The Correlation of Scalable Watermarking to Detect Traffic Flow Attacks in the Network
M.Prem Swarup, M.V.Rajesh

Abstract

Tracing attackers’ traffic through stepping stones is a challenging problem, especially when the attack traffic is encrypted and its timing is manipulated (perturbed) to interfere with traffic analysis. The random timing perturbation by the adversary can greatly reduce the effectiveness of passive, timing-based correlation techniques. We design SWIRL, a Scalable Watermark that is Invisible and Resilient to packet Losses. SWIRL is the first watermark that is practical to use for large-scale traffic analysis. SWIRL uses a flow-dependent approach to resist multi-flow attacks, marking each flow with a different pattern. SWIRL is robust to packet losses and network jitter, yet it introduces only small delays that are invisible to both benign users and determined adversaries. We analyze the performance of SWIRL both analytically and on the PlanetLab testbed, demonstrating very low error rates. We consider applications of SWIRL to stepping stone detection and linking anonymous communication.
Full Paper

IJCST/32/4/
A-751
   143 Performance of SKSE and MRSE in Cloud Cache
M.Sandhya, CH. Raja Jacob

Abstract

In this paper, for the first time, we define and solve the problem of multi-keyword ranked search over encrypted cloud data and establish a variety of privacy requirements. Among various multi-keyword semantics, we choose the efficient principle of “coordinate matching”, i.e., as many matches as possible, to effectively capture the similarity between query keywords and outsourced documents, and use “inner product similarity” to quantitatively formalize such a principle for similarity measurement. To meet the challenge of supporting multi-keyword semantics without privacy breaches, we first propose a basic MRSE scheme using secure inner product computation, and then significantly improve it to achieve privacy requirements under two levels of threat models. Thorough analysis investigating the privacy and efficiency guarantees of the proposed schemes is given, and experiments on a real-world dataset show that our proposed schemes introduce low overhead on both computation and communication. As future work, we will explore supporting other multi-keyword semantics (e.g., weighted query) over encrypted data, integrity checking of rank order in search results, and privacy guarantees under stronger threat models.
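The "coordinate matching via inner product" principle, in the clear (before any encryption), can be sketched as below: query and document become binary vectors over the vocabulary, and their inner product counts the matched keywords. The vocabulary and keyword sets are illustrative; the MRSE schemes compute this same product securely over encrypted vectors.

```python
def coordinate_match_score(query_keywords, doc_keywords, vocabulary):
    """Coordinate matching: the inner product of the binary query and
    document vectors equals the number of query keywords the document
    contains, which is the ranking score."""
    q = [1 if w in query_keywords else 0 for w in vocabulary]
    d = [1 if w in doc_keywords else 0 for w in vocabulary]
    return sum(qi * di for qi, di in zip(q, d))
```

Documents are then returned in descending score order; the privacy challenge the paper addresses is letting the server compute this product without learning the vectors themselves.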
Full Paper

IJCST/32/4/
A-752
   144 Enhanced IR Schemes for Summarizing for Relational Databases
K.Chandra Mouli, G.Aparna

Abstract

Commercial relational database management systems (RDBMSs) generally provide querying capabilities for text attributes that incorporate state-of-the-art information retrieval (IR) relevance ranking strategies, but this search functionality requires that queries specify the exact column or columns against which a given list of keywords is to be matched. This requirement can be cumbersome and inflexible from a user perspective: good answers to a keyword query might need to be “assembled”, in perhaps unforeseen ways, by joining tuples from multiple relations. This observation has motivated recent research on free-form keyword search over RDBMSs. In this paper, we adapt IR-style document-relevance ranking strategies to the problem of processing free-form keyword queries over RDBMSs. Our query model can handle queries with both AND and OR semantics, and exploits the sophisticated single-column text-search functionality often available in commercial RDBMSs. We develop query-processing strategies that build on a crucial characteristic of IR-style keyword search: only the few most relevant matches, according to some definition of “relevance”, are generally of interest. Consequently, rather than computing all matches for a keyword query, which leads to inefficient executions, our techniques focus on the top-k matches for the query, for moderate values of k.
Full Paper

IJCST/32/4/
A-753
   145 Unstructured P2P Networks-Multidimensional Historical Collection Data
V. Suresh, Amzed Ali Shaik

Abstract

Efficient handling of multidimensional data is a challenging issue in P2P systems. P2P is a distributed application architecture that partitions tasks or workloads among peers. Peers are equally privileged, equipotent participants in the application, and each computer in the network is referred to as a node. The owner of each computer on a P2P network sets aside a portion of its resources, such as processing power, disk storage, or network bandwidth, to be directly available to other network participants, without the need for central coordination by servers or stable hosts. A P2P-based framework supporting the extraction of aggregates from historical multidimensional data is proposed, which provides efficient and robust query evaluation. When a data population is published, the data are summarized in a synopsis, consisting of an index built on top of a set of subsynopses (storing compressed representations of distinct data portions). The index and the subsynopses are distributed across the network, and suitable replication mechanisms, taking into account the query workload and network conditions, are employed to provide the appropriate coverage for both the index and the subsynopses.
Full Paper

IJCST/32/4/
A-754
   146 Excellence Preferences by using Spatial Databases
Shaik Khaja Fayazuddin, K.John Paul

Abstract

A spatial preference query ranks objects based on the qualities of features in their spatial neighborhood. For example, using a real estate agency database of flats for lease, a customer may want to rank the flats with respect to the appropriateness of their location, defined after aggregating the qualities of other features (e.g., restaurants, cafes, hospitals, markets, etc.) within their spatial neighborhood. Such a neighborhood concept can be specified by the user via different functions. It can be an explicit circular region within a given distance from the flat; another intuitive definition is to assign higher weights to the features based on their proximity to the flat. In this paper, we formally define spatial preference queries and propose appropriate indexing techniques and search algorithms for them. Extensive evaluation of our methods on both real and synthetic data reveals that an optimized branch-and-bound solution is efficient and robust with respect to different parameters.
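The circular-region ("range") variant of the neighborhood score described above can be sketched as follows; taking the maximum feature quality within the radius is one common aggregate, assumed here for illustration, and the coordinates and qualities are made up.

```python
import math

def range_score(obj, features, radius):
    """Range-based spatial preference score of an object at (x, y):
    the best quality among features within `radius` of it.
    features: iterable of (x, y, quality) triples."""
    best = 0.0
    for fx, fy, quality in features:
        if math.hypot(obj[0] - fx, obj[1] - fy) <= radius:
            best = max(best, quality)
    return best
```

A top-k spatial preference query then returns the k objects with the highest such scores; the paper's branch-and-bound indexing avoids computing the score for every object exhaustively, which this brute-force sketch does.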
Full Paper

IJCST/32/4/
A-755
   147 A High-Speed Tree Pattern Matching Algorithm for XML Query
D. Anusha, G. Ramesh Naidu, N. Balayesu

Abstract

Finding all distinct matchings of a query tree pattern is the main operation of XML query evaluation. We consider a large set of XML tree patterns, called extended XML tree patterns, which may include parent-child (P-C) and ancestor-descendant (A-D) relationships, negation functions, wildcards, and order restrictions. We establish a theoretical framework around “matching cross” which demonstrates the intrinsic reason in the proof of optimality of holistic algorithms. Existing methods for tree pattern matching are decomposition-matching-merging processes, which may produce large useless intermediate results or require repeated matching of some sub-patterns. We propose a fast tree pattern matching algorithm called TreeMatch to directly find all distinct matchings of a query tree pattern. The only requirement on the data source is that the matching elements of the non-leaf pattern nodes do not contain sub-elements with the same tag. TreeMatch does not produce any intermediate results; the final results are compactly encoded in stacks, from which the explicit representation can be produced efficiently. It can effectively control the size of intermediate results during query processing.
Full Paper

IJCST/32/4/
A-756
   148 Dynamic Hierarchical Mobility Management Approach for IP-Based Mobile Networks
A. Lakshman Rao, P. Jagadeesh Kumar

Abstract

One of the major challenges in wireless network design is efficient mobility management, which can be addressed globally (macromobility) and locally (micromobility). Mobile Internet Protocol (IP) is a commonly accepted standard to address global mobility of Mobile Hosts (MHs). It requires the MHs to register with the Home Agents (HAs) whenever their care-of addresses change. Several mobility management strategies have been proposed which aim at reducing the signaling traffic related to the Mobile Terminals (MTs) registering with the Home Agents (HAs) whenever their Care-of-Addresses (CoAs) change. They use different Foreign Agent (FA) and Gateway FA (GFA) hierarchies to concentrate the registration processes. However, such registrations may cause excessive signaling traffic and long service delay. To solve this problem, the Hierarchical Mobile IP (HMIP) protocol was proposed to employ the hierarchy of FAs and GFAs to localize registration operations. However, the system performance is critically affected by the selection of GFAs and their reliability. In this paper, we introduce a novel dynamic hierarchical mobility management strategy for mobile IP networks, in which different hierarchies are dynamically set up for different users and the signaling burden is evenly distributed across the network. To justify the effectiveness of our proposed scheme, we develop an analytical model to evaluate the signaling cost. The proposed dynamic hierarchical mobility management strategy can significantly reduce the system signaling cost under various scenarios, and the system robustness is greatly enhanced.
Full Paper

IJCST/32/4/
A-757
   149 Energy Maps for Wireless Mobile Networks
G. Santhi, P. Sandya Devi

Abstract

Wireless networks consist of mobile nodes interconnected by multihop communication paths. Energy planning and optimization constitute one of the most significant challenges for high-mobility networks. This paper proposes “energy maps”, which are maps of the end-to-end energy metrics between physical locations in space. Energy maps are very valuable in designing energy optimization and planning protocols in high-mobility networks. In this paper, a novel framework is proposed to share, retain, and refine end-to-end energy metrics in the joint memory of the nodes, over time scales on which this information can be spread through the network and utilized for energy planning decisions. We construct maps of end-to-end energy metrics that enable energy optimization in high-mobility networks. We show how to (1) compute the spatial derivatives of energy potentials in high-mobility networks, (2) construct energy maps on demand via path integration methods, and (3) distribute, share, fuse, and refine energy maps over time by information exchange during encounters.
Full Paper

IJCST/32/4/
A-758
   150 A Novel Multiview Point Clusterization
S. K. A. Manoj, V. Vijaya Lakshmi

Abstract

Clustering methods have to assume some cluster relationship among the data objects that they are applied on. Similarity between a pair of objects can be defined either explicitly or implicitly. The major difference between a traditional dissimilarity/similarity measure and ours is that the former uses only a single viewpoint, which is the origin, while the latter utilizes many different viewpoints: objects assumed not to be in the same cluster as the two objects being measured. Using multiple viewpoints, a more informative assessment of similarity can be achieved. Theoretical analysis and an empirical study are conducted to support this claim. Two criterion functions for document clustering are proposed based on this new measure. We compare them with several well-known clustering algorithms that use other popular similarity measures on various document collections to verify the advantages of our proposal.
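The multi-viewpoint idea can be sketched as below: instead of one inner product taken from the origin, the similarity of two objects is averaged over inner products taken relative to each viewpoint. The dot-product form of the per-viewpoint similarity and the toy vectors are illustrative assumptions.

```python
def multiviewpoint_similarity(a, b, viewpoints):
    """Average, over all viewpoints v, of the similarity of a and b
    as seen from v: sim_v(a, b) = (a - v) . (b - v)."""
    total = 0.0
    for v in viewpoints:
        da = [ai - vi for ai, vi in zip(a, v)]
        db = [bi - vi for bi, vi in zip(b, v)]
        total += sum(x * y for x, y in zip(da, db))
    return total / len(viewpoints)
```

With the origin as the only viewpoint this reduces to the ordinary dot product; adding viewpoints drawn from other clusters is what makes the assessment more informative.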
Full Paper

IJCST/32/4/
A-759
   151 An Efficient Query Navigation Based on Concept Hierarchies
Suchitra Reyya, Sivakoti Nagma, Sangeeta Viswanadh

Abstract

Search queries on biomedical databases, such as PubMed, often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Result optimization and result categorization for biomedical databases are the focus of this work. A natural way to organize biomedical citations is according to their MeSH annotations; MeSH is a comprehensive concept hierarchy used by PubMed. In this paper, we present the BioIntelR (BIR) system, which adopts the BioNav approach and enables the user to navigate a large number of query results by organizing them using the MeSH concept hierarchy. First, the BioIntelR system prompts the user for the search criteria, and the system automatically connects to a middle layer created at the application level which directs the query along the proper valid query path to select the correct criteria for the search result from the biomedical database. The query results are organized into a navigation tree. At each node expansion step, the BIR system reveals only a small subset of the concept nodes, selected such that the expected user navigation cost is minimized. In contrast to previous systems, the BIR system optimizes the query result time and minimizes the query result set for easy user navigation.
Full Paper

IJCST/32/4/
A-760
   152 Secure and Efficient Transformation by using Caching Strategies: Wireless Networks
B.Ganga Bhavani, M.Vishnu Murthy

Abstract

Most research in ad hoc networks focuses on routing, and not much work has been done on data access. A common technique used to improve the performance of data access is caching. Cooperative caching, which allows the sharing and coordination of cached data among multiple nodes, can further exploit the potential of caching techniques. Due to the mobility and resource constraints of ad hoc networks, cooperative caching techniques designed for wired networks may not be applicable to ad hoc networks. We address cooperative caching in wireless networks, where the nodes may be mobile and exchange information in a peer-to-peer fashion. We consider both nodes with large-sized and small-sized caches. For large-sized caches, we devise a strategy where nodes, independent of each other, decide whether to cache some content and for how long. For small-sized caches, we aim to design a content replacement strategy that allows nodes to successfully store newly received information while maintaining the good performance of the content distribution system. Under both conditions, each node takes decisions according to its perception of what nearby users may store in their caches and with the aim of differentiating its own cache content from that of the other nodes. The result is the creation of content diversity within the node’s neighborhood, so that a requesting user likely finds the desired information nearby.
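The content-diversity replacement idea for small caches can be sketched as follows; the specific eviction rule (evict whichever cached item is most duplicated in neighbors' caches) is an assumed simplification of the strategy described above.

```python
def cache_with_diversity(cache, new_item, neighbor_caches, capacity):
    """Diversity-aware replacement: admit the new item; when the cache
    is full, evict the item most duplicated in nearby nodes' caches,
    so the neighborhood as a whole stores diverse content."""
    if new_item in cache:
        return cache
    if len(cache) < capacity:
        cache.append(new_item)
        return cache

    def duplication(item):
        # how many neighboring caches already hold this item
        return sum(item in nc for nc in neighbor_caches)

    victim = max(cache, key=duplication)
    cache[cache.index(victim)] = new_item
    return cache
```

In the actual setting a node only has a local, possibly stale perception of `neighbor_caches`, gathered from overheard peer-to-peer exchanges rather than direct inspection.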
Full Paper

IJCST/32/4/
A-761
   153 Temporal-Spatial Local Gaussian Process Experts with Vision Based Human Motion Tracking
Satyanarayana Mummana, T. Ravi Kiran

Abstract

Human pose estimation via motion tracking systems can be considered a regression problem within a discriminative framework. It is always a challenging task to model the mapping from observation space to state space because of the high-dimensional, multimodal conditional distribution. To build this mapping, existing techniques usually involve a large set of training samples in the learning process, which limits their capability to deal with multimodality. In this work, we propose a novel online sparse Gaussian process regression model to recover 3-D human motion from monocular videos. In particular, we exploit the fact that for a given test input, its output is mainly determined by the training samples potentially residing in its local neighborhood, defined in the unified input-output space.
Full Paper

IJCST/32/4/
A-762
   154 Apply Filter for Hidden Sensitive Association Rule with Limited Side Effects
B.Venkateswarlu, Vagesna Anusha

Abstract

Data mining techniques have been widely used in various applications. However, the misuse of these techniques may lead to the disclosure of sensitive information. Researchers have recently made efforts at hiding sensitive association rules. Nevertheless, undesired side effects, e.g., nonsensitive rules falsely hidden and spurious rules falsely generated, may be produced in the rule hiding process. In this paper, we present a novel approach that strategically modifies a few transactions in the transaction database to decrease the supports or confidences of sensitive rules without producing side effects, and that applies filters for redundant rules among the hidden sensitive rules. Since the correlation among rules can make this goal impossible to achieve fully, we propose heuristic methods for increasing the number of hidden sensitive rules and reducing the number of modified entries. The experimental results show the effectiveness of our approach, i.e., undesired side effects are avoided in the rule hiding process. The results also show that in most cases all the sensitive rules are hidden without spurious rules being falsely generated. Moreover, our approach scales well with database size, and we observe the influence of correlation among rules on rule hiding. The processed data supports better decision making.
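The basic mechanism described, lowering a sensitive rule's confidence by modifying a few supporting transactions, can be sketched as below. This is a minimal illustration, not the paper's heuristic: it ignores side effects on other rules, and the function and dataset names are invented for the example.

```python
def hide_rule(transactions, lhs, rhs, min_conf):
    """Drive the confidence of sensitive rule lhs -> rhs below min_conf
    by deleting one rhs item from transactions supporting the full rule."""
    lhs, rhs = set(lhs), set(rhs)

    def conf():
        sup_lhs = sum(1 for t in transactions if lhs <= t)
        sup_both = sum(1 for t in transactions if lhs | rhs <= t)
        return sup_both / sup_lhs if sup_lhs else 0.0

    for t in transactions:
        if conf() < min_conf:
            break                         # rule is hidden; stop modifying
        if lhs | rhs <= t:
            t.discard(next(iter(rhs)))    # modify one entry of this transaction
    return transactions

db = [{"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"bread", "milk"}]
hide_rule(db, ["bread"], ["milk"], min_conf=0.5)
# confidence of bread -> milk started at 0.75 and is now below 0.5
```

Stopping as soon as the confidence threshold is crossed keeps the number of modified entries small, which is the spirit of the paper's goal.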
Full Paper

IJCST/32/4/
A-763
   155 Grading Spatial Data By Value Preferences using N2S2 Algorithm
Suchitra Reyya O, M.Ramesh Babu

Abstract

Customer preference queries are very important in spatial databases. We define a spatial database system as a database system that offers spatial data types in its data model and query language and supports spatial data types in its implementation, providing at least spatial indexing and spatial join methods. Spatial database systems provide the underlying database technology for geographic information systems and other applications. We survey data modeling, querying, data structures and algorithms, and system architecture for such systems; the emphasis is on describing known technology in a coherent manner rather than on listing open problems. The feature score of a given object is derived from the quality of features (e.g., location and nearby features) in its spatial neighborhood. This neighborhood can be defined by different user-specified functions: it can be an explicit circular region within a given distance from the object, or, alternatively, higher scores can be assigned to features based on their proximity to the object. In this paper, we formally define spatial preference queries and propose suitable dynamic index techniques and search algorithms for them. We extend the results with a dynamic index structure in order to accommodate time-variant changes in the spatial data. Our current work addresses the top-k spatial preference query on road networks, in which the distance between an object and a feature is defined by their shortest-path distance. By treating this query as a subset of dynamic skyline queries, we provide the N2S2 algorithm for computing it. This algorithm performs well compared with the general branch-and-bound algorithm for skyline queries.
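The range-based neighborhood scoring described above can be sketched as follows, using Euclidean distance for simplicity (the paper's road-network variant would substitute shortest-path distance). The function names and data layout are assumptions for the example, not the N2S2 algorithm itself.

```python
import math

def preference_score(obj, features, radius):
    """Range-based score: the best feature quality found within
    `radius` of the object, or 0.0 if no feature is that close."""
    ox, oy = obj
    best = 0.0
    for fx, fy, quality in features:
        if math.hypot(fx - ox, fy - oy) <= radius:
            best = max(best, quality)
    return best

def top_k(objects, features, radius, k):
    """Rank objects by neighborhood score and keep the k best."""
    return sorted(objects, key=lambda o: -preference_score(o, features, radius))[:k]

feats = [(1, 1, 0.9), (5, 5, 0.4)]     # (x, y, quality)
objs = [(0, 0), (5, 4), (10, 10)]
print(top_k(objs, feats, radius=2.0, k=2))  # [(0, 0), (5, 4)]
```

A real system would answer this with a spatial index (e.g., an R-tree) rather than the linear scan shown here.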
Full Paper

IJCST/32/4/
A-764
   156 Document Clustering With Genetic Based k-Means Algorithm
Cheekatla Swapna Priya, Mortha Samuel Jacob Sam

Abstract

There are two important problems worth researching in the field of personalized information services based on user models. One is how to obtain and describe personal user information, i.e., building the user model; the other is how to organize the information resources, i.e., document clustering. It is difficult to find the desired information without a proper clustering algorithm. Several new ideas have been proposed in recent years, but most of them take only the text content into account, whereas other information may contribute more to document clustering, such as text size, font, and other appearance characteristics, the so-called visual features. In this paper we introduce a new technique called the Closed Document Clustering Method (CDCM), which uses advanced clustering metrics. This method enhances the previous approach of clustering scientific documents based on visual features, the so-called VF-Clustering algorithm. Five kinds of visual features of documents are defined: body, abstract, subtitle, keyword, and title. The crossover and mutation operators of the genetic algorithm are used to dynamically adjust the value of k and the cluster centers in the k-means algorithm. Experimental results support our approach. The main aim of this paper is to eliminate redundant documents and assign a priority to each document in the cluster. Among the five visual features, the clustering accuracy and stability of the subtitle are second only to those of the body, but its efficiency is much better because the subtitle is much smaller than the body. The accuracy of clustering by combining subtitle and keyword is better than either individually, but slightly less than that of combining subtitle, keyword, and body. If efficiency is an essential factor, clustering by combining subtitle and keyword can be an optimal choice. The proposed system outperforms the previous system.
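The idea of using genetic mutation to adjust both k and the cluster centers can be sketched in one dimension as below. This is an illustrative toy, not the paper's CDCM algorithm: the fitness function, penalty weight, and operator mix are all assumptions.

```python
import random

def kmeans_fitness(points, centers, penalty=0.5):
    """Fitness = negative (within-cluster distance + penalty on k),
    so fewer, better-placed centers score higher."""
    wc = sum(min(abs(p - c) for c in centers) for p in points)
    return -(wc + penalty * len(centers))

def mutate(centers, points):
    """GA-style mutation: perturb one center, add one, or drop one.
    Elitist acceptance keeps the mutant only if fitness improves."""
    mutant = list(centers)
    op = random.choice(["perturb", "add", "drop"])
    if op == "perturb":
        i = random.randrange(len(mutant))
        mutant[i] += random.uniform(-1.0, 1.0)
    elif op == "add":
        mutant.append(random.choice(points))   # grow k
    elif op == "drop" and len(mutant) > 1:
        mutant.pop(random.randrange(len(mutant)))  # shrink k
    return mutant if kmeans_fitness(points, mutant) > kmeans_fitness(points, centers) else centers

random.seed(0)
pts = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]   # two obvious 1-D clusters
centers = [4.0]                         # deliberately poor single-center start
for _ in range(200):
    centers = mutate(centers, pts)
```

Because adding a center costs a fixed penalty, k settles near the number of genuine clusters instead of growing unboundedly.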
Full Paper

IJCST/32/4/
A-765
   157 Virtualized Heterogeneous Cluster System to Automatically Balance The Workload
R. Sailaja, M.V. Sheela Devi

Abstract

System virtualization can aggregate the functionality of multiple standalone computer systems into a single hardware computer. It is important to virtualize the computing nodes with multicore processors in a cluster system, in order to improve hardware utilization while decreasing power cost. In a virtualized cluster system, multiple virtual machines run on each computing node. However, automatically balancing the workload of the virtual machines on each physical computing node is a challenging issue, and it differs from load balancing in traditional cluster systems. In this paper, we follow a management framework for the virtualized cluster system and an automatic performance tuning strategy to balance its workload. We propose a VMViewer that allocates resources dynamically to multiple VMs, increasing the performance of the cluster system. The experimental results indicate that the proposed method enhances performance compared to previous systems.
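One simple way to allocate a node's resources dynamically among its VMs is proportional sharing based on observed demand. The abstract does not specify the VMViewer's policy, so the following is only a plausible sketch; the function name, demand figures, and proportional rule are all assumptions.

```python
def balance(vm_demands, capacity):
    """Proportional-share tuning: give each VM a slice of the node's
    capacity proportional to its observed demand."""
    total = sum(vm_demands.values())
    return {vm: capacity * d / total for vm, d in vm_demands.items()}

demands = {"vm1": 30, "vm2": 10, "vm3": 60}   # observed CPU demand units
print(balance(demands, capacity=80))
# {'vm1': 24.0, 'vm2': 8.0, 'vm3': 48.0}
```

A production tuner would also enforce per-VM minimums and react to demand changes over time, which this one-shot function omits.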
Full Paper

IJCST/32/4/
A-766
   158 Secure Data by Limiting CN and DOS Attacks in Ad-Hoc Network with Sensor using Randomized Dispersive Routing
P. Vijay Kumar, M. Ram Bhupal

Abstract

Compromised-node and denial-of-service attacks are two key attacks in wireless sensor networks. We study data delivery mechanisms that can, with high probability, circumvent black holes formed by these attacks. We argue that classic multipath routing approaches are vulnerable to such attacks, mainly due to their deterministic nature: once the adversary acquires the routing algorithm, it can compute the same routes known to the source, making all information sent over these routes vulnerable to its attacks. We develop mechanisms that generate randomized multipath routes. Under our design, the routes taken by the "shares" of different packets change over time, so even if the routing algorithm becomes known to the adversary, the adversary still cannot pinpoint the routes traversed by each packet. Besides randomness, the generated routes are also highly dispersive and energy efficient, making them quite capable of circumventing black holes. We analytically investigate the security and energy performance of the proposed scheme.
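The "shares" idea above can be sketched with simple XOR secret sharing plus per-share random relay selection: any single route reveals nothing, and all shares must be intercepted to reconstruct the packet. This is an illustrative sketch, not the paper's routing scheme; the node names and hop count are invented.

```python
import functools
import os
import random

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_shares(secret: bytes, n: int):
    """Split a packet into n XOR shares: any n-1 shares look random,
    but XOR-ing all n recovers the secret."""
    shares = [os.urandom(len(secret)) for _ in range(n - 1)]
    last = functools.reduce(xor_bytes, shares, secret)
    return shares + [last]

def combine(shares):
    return functools.reduce(xor_bytes, shares)

def random_route(src, dst, nodes, hops=2):
    """Pick fresh random relays per share, so routes differ between
    packets even if the algorithm itself is known to an adversary."""
    relays = random.sample([v for v in nodes if v not in (src, dst)], hops)
    return [src] + relays + [dst]

msg = b"sensor reading"
shares = make_shares(msg, 3)
routes = [random_route("S", "D", ["S", "A", "B", "C", "D"]) for _ in shares]
assert combine(shares) == msg       # all three shares reconstruct the packet
```

The paper's actual mechanisms add dispersiveness and energy-awareness constraints on the route choice, which this uniform random sampling does not model.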
Full Paper

IJCST/32/4/
A-767
   159 Analyze and Update Data Method for Skyline Query Processing Against Distributed Data Sites
Nagalakshmi. Panchakatla, D. Umadevi, Shaheda Akthar

Abstract

The skyline of a multidimensional point set is the subset of interesting points that are not dominated by any other point. In this paper, we introduce the Analyze and Update Data Method, which enhances constrained skyline query processing against distributed data sites with a user-centric approach, allowing users to easily compare result data dynamically. We consider constrained skyline queries in a large-scale unstructured distributed environment, where the relevant data are distributed among geographically scattered sites. A partition algorithm divides all data sites into incomparable groups such that the skyline computations in all groups can be parallelized without changing the final result; we then develop a novel algorithmic framework called PaDSkyline for parallel skyline query processing among the partitioned site groups. The method also employs intragroup optimization and a multifiltering technique to improve skyline query processing within each group. In particular, multiple (local) skyline points are sent together with the query as filtering points, which help identify unqualified local skyline points early on a data site. In this way, the amount of data transmitted over network connections is reduced, and thus the overall query response time is further shortened. Cost models and heuristics are proposed to guide the selection of a given number of filtering points from a superset, and a cost-efficient model determines how many filtering points to use for a particular data site. The results of an extensive experimental study demonstrate that our proposals are effective and efficient.
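The dominance test and the multifiltering step can be sketched as below: filtering points shipped with the query prune dominated local points before the site computes and returns its local skyline. The function names are illustrative; the paper's cost models for choosing filtering points are not modeled.

```python
def dominates(a, b):
    """a dominates b if a is <= b in every dimension and strictly
    better in at least one (assuming minimization on all attributes)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def local_skyline(points, filters=()):
    """A site's local skyline; filtering points sent with the query
    discard dominated points early, shrinking the data sent back."""
    candidates = [p for p in points if not any(dominates(f, p) for f in filters)]
    return [p for p in candidates
            if not any(dominates(q, p) for q in candidates if q != p)]

site = [(2, 9), (4, 4), (9, 1), (6, 6)]
print(local_skyline(site, filters=[(3, 3)]))
# [(2, 9), (9, 1)]  -- (4, 4) and (6, 6) were pruned by filter point (3, 3)
```

Only the survivors travel back over the network, which is exactly the transfer saving the abstract describes.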
Full Paper

IJCST/32/4/
A-768
   160 Data Analysis using Adaptive Join Operators for Result Rate Optimization on Streaming Inputs
Dr. G. Srinivas, Dr. V. Suryanarayana, Sri. B. Mouleswara Rao

Abstract

Adaptive join algorithms have recently attracted much attention in emerging applications where data are provided by autonomous data sources over heterogeneous network environments. Their main advantage over traditional join techniques is that they can start producing join results as soon as the first input tuples are available, improving pipelining by smoothing join result production and by masking source or network delays. In this paper, we improve on existing techniques by introducing data analysis over heterogeneous network environments while keeping the same result-rate optimization on streaming inputs. We follow the Double Index NEsted-loops Reactive join (DINER), a new adaptive two-way join algorithm for result-rate maximization, and the Multiple Index NEsted-loop Reactive join (MINER), a multiway join operator that inherits its principles from DINER. Our experiments using real and synthetic data sets demonstrate that DINER outperforms previous adaptive join algorithms, producing result tuples at a significantly higher rate while making better use of the available memory. Our experiments also show that, in the presence of multiple inputs, MINER produces a high percentage of early results, outperforming existing techniques for adaptive multiway joins.
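The early-result behavior that DINER targets can be illustrated with a plain symmetric hash join, which emits a match the moment the second tuple of a pair arrives from either input. This is not the DINER algorithm itself (it omits DINER's memory management and reactive phases); it only shows the streaming pattern.

```python
from collections import defaultdict

def symmetric_hash_join(stream):
    """Join two interleaved streams, emitting each result as soon as
    both matching tuples have arrived (no blocking on either input)."""
    left, right = defaultdict(list), defaultdict(list)
    for side, key, val in stream:
        if side == "L":
            left[key].append(val)
            for other in right[key]:       # probe the opposite hash table
                yield (key, val, other)
        else:
            right[key].append(val)
            for other in left[key]:
                yield (key, other, val)

interleaved = [("L", 1, "a"), ("R", 1, "x"), ("R", 2, "y"), ("L", 2, "b")]
print(list(symmetric_hash_join(interleaved)))
# [(1, 'a', 'x'), (2, 'b', 'y')]
```

Note that the second result is produced before either input is exhausted, which is what "masking source or network delays" buys in practice.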
Full Paper

IJCST/32/4/
A-769
   161 An Advanced Clustering Algorithm for Text Classification Problem
Dr. C. P. V. N. J. Mohan Rao, T. T. Rajeswara Rao

Abstract

This paper investigates a novel algorithm, EGA-SVM, for the text classification problem, combining Support Vector Machines (SVM) with an elitist Genetic Algorithm (GA). The new algorithm uses the EGA, which is based on an elite survival strategy, to optimize the parameters of the SVM. The Iris dataset and one hundred Chinese news reports are chosen to compare EGA-SVM, GA-SVM, and traditional SVM. The numerical experiments show that EGA-SVM improves classification performance more effectively than the other algorithms. This text classification algorithm can easily be extended to literature in the field of electrical engineering. Feature clustering is a powerful method to reduce the dimensionality of feature vectors for text classification. In this paper, we propose a fuzzy similarity-based self-constructing algorithm for feature clustering. The words in the feature vector of a document set are grouped into clusters based on a similarity test: words that are similar to each other are grouped into the same cluster. Each cluster is characterized by a membership function with a statistical mean and deviation. When all the words have been fed in, a desired number of clusters is formed automatically, and we then have one extracted feature for each cluster. The extracted feature corresponding to a cluster is a weighted combination of the words contained in the cluster. With this algorithm, the derived membership functions match closely with, and properly describe, the real distribution of the training data. Moreover, the user need not specify the number of extracted features in advance, and trial-and-error for determining the appropriate number of extracted features is avoided.
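The self-constructing clustering step can be sketched in one dimension: each incoming value joins the cluster whose Gaussian membership function rates it highly, or seeds a new cluster otherwise, so the cluster count emerges automatically. This is a simplified illustration of the idea, not the authors' algorithm; the threshold, initial deviation, and running-mean update are assumptions.

```python
import math

def membership(x, mean, sigma):
    """Gaussian membership of a word statistic in a cluster."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))

def self_construct(values, threshold=0.5, init_sigma=1.0):
    """Incremental fuzzy clustering: join the best-matching cluster if
    membership clears the threshold, otherwise start a new cluster."""
    clusters = []                       # each: {"mean", "sigma", "n"}
    for x in values:
        best = max(clusters,
                   key=lambda c: membership(x, c["mean"], c["sigma"]),
                   default=None)
        if best and membership(x, best["mean"], best["sigma"]) >= threshold:
            best["n"] += 1
            best["mean"] += (x - best["mean"]) / best["n"]   # running mean
        else:
            clusters.append({"mean": x, "sigma": init_sigma, "n": 1})
    return clusters

cs = self_construct([1.0, 1.1, 0.9, 5.0, 5.2])
print(len(cs))   # 2 -- the cluster count was never specified in advance
```

The full method would also update each cluster's deviation and operate on multidimensional word statistics; those details are omitted here.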
Full Paper

IJCST/32/4/
A-770
   162 An Automated Advanced Clustering Algorithm for Text Classification
G. Narasimha Rao, R. Ramesh, D. Rajesh, D. Chandra Sekhar

Abstract

A major characteristic of the text document classification problem is the extremely high dimensionality of text data. In this paper we present four new algorithms for feature/word selection for the purpose of text classification. We use sequential forward selection methods based on improved mutual information criterion functions. The performance of the proposed evaluation functions is compared to that of information gain, which evaluates features individually. We present experimental results using a naive Bayes classifier based on the multinomial model, a linear support vector machine, and k-nearest neighbor classifiers on the Reuters data set. Feature clustering is a powerful method to reduce the dimensionality of feature vectors for text classification. In this paper, we propose a fuzzy similarity-based self-constructing algorithm for feature clustering. The words in the feature vector of a document set are grouped into clusters based on a similarity test: words that are similar to each other are grouped into the same cluster. Each cluster is characterized by a membership function with a statistical mean and deviation. When all the words have been fed in, a desired number of clusters is formed automatically, and we then have one extracted feature for each cluster. The extracted feature corresponding to a cluster is a weighted combination of the words contained in the cluster. With this algorithm, the derived membership functions match closely with, and properly describe, the real distribution of the training data. Moreover, the user need not specify the number of extracted features in advance, and trial-and-error for determining the appropriate number of extracted features is avoided.
Full Paper

IJCST/32/4/
A-771
   163 Intrusion Response System for Relational Databases by using Secondary Level Authentication
Priscilla. Gummadi, Hari Krishna. Deevi, Dr. K. Rama Krishnaiah, G. Sudhir

Abstract

The intrusion response component of an overall intrusion detection system is responsible for issuing a suitable response to an anomalous request. We propose the Secondary Service Link Authenticator (SSLA) to support the intrusion response system, following the notion of database response policies to support an intrusion response system tailored for a DBMS. The interactive response policy language makes it very easy for database administrators to specify appropriate response actions for different circumstances, depending on the nature of the anomalous request. The two main issues we address in the context of such response policies are policy matching and policy administration. For the policy matching problem, we use two algorithms that efficiently search the policy database for policies matching an anomalous request; the experimental evaluation shows that these techniques are very efficient. The other issue we address is the administration of response policies, to prevent malicious modifications to policy objects by legitimate users. We propose a novel Joint Threshold Administration Model (JTAM) based on the principle of separation of duty. The key idea in JTAM is that a policy object is jointly administered by at least k database administrators (DBAs): any modification made to a policy object is invalid unless it has been authorized by at least k DBAs. We present the design details of JTAM, which is based on a cryptographic threshold signature scheme, show how JTAM prevents malicious modifications to policy objects by authorized users, and integrate it with the SSLA. We also implement JTAM in the PostgreSQL DBMS and report experimental results on the efficiency of our techniques.
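The k-of-n rule at the heart of JTAM can be sketched as a simple validity check. Note that the real JTAM relies on a cryptographic threshold signature scheme; this sketch only models the administrative rule, and the names are illustrative.

```python
def policy_valid(signatures, authorized_dbas, k):
    """A policy modification is valid only if at least k distinct
    authorized DBAs have signed off on it (separation of duty)."""
    approvals = {s for s in signatures if s in authorized_dbas}
    return len(approvals) >= k

dbas = {"alice", "bob", "carol", "dave"}
print(policy_valid({"alice", "bob"}, dbas, k=3))            # False: only 2 of 3
print(policy_valid({"alice", "bob", "carol"}, dbas, k=3))   # True
```

A threshold signature achieves the same k-of-n property cryptographically, so no single DBA, even a malicious one with legitimate access, can forge a valid policy change alone.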
Full Paper

IJCST/32/4/
A-772
   164 Managing Multidimensional Historical Aggregate Data in Unstructured P2P Networks using Virtual Aggregate Cubes
Devika Rani. Kadari, Hari Krishna. Deevi, Dr. K. Rama Krishnaiah, G. Sudhir

Abstract

A P2P-based framework supporting the extraction of aggregates from historical multidimensional data is proposed, providing efficient and robust query evaluation. When a data population is published, the data are summarized in a synopsis consisting of an index built on top of a set of subsynopses (which store compressed representations of distinct data portions). The index and the subsynopses are distributed across the network, and suitable replication mechanisms, taking into account the query workload and network conditions, provide appropriate coverage for both. In this paper we introduce virtual data cubes, which hold historical aggregate data at different levels of aggregation on top of the subsynopses.
Full Paper

IJCST/32/4/
A-773
   165 Performance of Web Information Gathering System using Ontology and Information Extraction
Y. Ramesh Kumar, U. Kartheek Chandra Patnaik, G. Ananth Kumar

Abstract

Traditional methods for Information Extraction (IE) have focused on supervised learning techniques such as hidden Markov models, self-supervised methods, rule learning, and Conditional Random Fields (CRFs). The WebNLP framework is based on CRFs and Markov models. These techniques learn a language model or a set of rules from a set of hand-tagged training documents and then apply the model or rules to new texts. Models learned in this manner are effective on documents similar to the training set, but extract quite poorly from documents with a different genre or style, which are common on the web. As a result, this approach has difficulty scaling to the Web, due to the diversity of text styles and genres on the Web and the prohibitive cost of creating an equally diverse set of hand-tagged documents. In this paper we propose an adaptive IE system that uses Ontology-Based Information Extraction techniques, extracting all relations by learning a set of lexico-syntactic patterns, unlike WebNLP. Ontologies permit greater machine interpretability of content than that supported by XML, RDF, and RDF Schema (RDF-S), by providing additional vocabulary along with a formal semantics. Thus, ontologies represent an ideal knowledge background on which to base text understanding and enable the extraction of relevant information.
Full Paper

IJCST/32/4/
A-774