MULTI-DIMENSIONAL DATA PREPARATION: A PROCESS TO SUPPORT VULNERABILITY ANALYSIS AND CLIMATE CHANGE ADAPTATION

ABSTRACT:-
Agriculture is the backbone of a country’s economic system: it provides not only food and raw materials but also employment for a large percentage of the population. Determining the degree of agricultural vulnerability therefore serves as a guide for sustainability and adaptation to changing future conditions. In many cases, vulnerability analysis data is restricted to authorized personnel, leaving open data policies aside. Furthermore, raw data in its native format tends to be diverse in structure, storage format, and access protocol. Having a large amount of open data is important, though not sufficient, for obtaining accurate results in data-driven analysis. These data require a strict preparation process, and guides that facilitate this process are becoming increasingly necessary. In this study, we present the step-by-step processing of several open data sources in order to obtain quality information to feed different agricultural vulnerability analyses. The data preparation process is applied to a case study of the upper Cauca River basin in Colombia. All data sources in this study are public and official, and are available from the web platforms where they were collected. In addition, a ranking of variable importance for each dataset was obtained through automatic methods and validated through expert knowledge. Experimental validation showed acceptable agreement between the ranking produced by the automatic methods and the ranking given by the raters. Weather prediction can be performed using various data mining classifiers. Some of the techniques that assist in weather prediction are discussed below. Each technique has its own advantages and disadvantages, and the most suitable classification technique depends on the requirements and conditions.
Naïve Bayes: The Naïve Bayes classifier is one of the simplest Bayesian network models and is based on Bayes' theorem. It predicts, for each record, the probability of membership in each class. The classifier is highly scalable, requiring a number of parameters linear in the number of attributes of a problem. It relies on conditional probability and assumes that the attributes are independent of each other given the class. The class with the highest posterior probability is selected, a rule known as Maximum a Posteriori (MAP).


INTRODUCTION:-
Agriculture is one of the activities most affected by climatic factors. It provides not only food and raw materials but also employment for a large percentage of the population. Agriculture contributes approximately 5% to 7% of GDP in modern economies, and the economic system becomes more vulnerable as this percentage increases. The effects of climate variability and climate change on food production are now a reality. These phenomena have begun to affect the production of the ten main crops (barley, cassava, corn, palm oil, rapeseed, rice, sorghum, soybeans, sugarcane, and wheat), which are key food sources for human beings. These crops represent 83% of all calories produced on arable land, and for this reason understanding how much they can be affected has become an urgent task for researchers around the world. Determining the degree of agricultural vulnerability therefore serves as a guide for sustainability and adaptation to changing future conditions. Agricultural vulnerability has become a fundamental basis for analyzing the risks of climate variability.
In recent decades, several studies have focused on analyzing and measuring this type of risk. A systematic mapping of these studies was developed using the methodology proposed by Petersen, for which the Scopus, Google Scholar, and ScienceDirect databases were consulted. Eighty related works were found and classified into three topics: environmental, soil and crops, and water supply. We highlight those related to agriculture and food security below. RHoMIS (Rural Household Multi-Indicator Survey) is a methodology composed of surveys and databases to monitor the agricultural sector through food systems. Likewise, the International Model for Policy Analysis of Agricultural Commodities and Trade (IMPACT) is a network of economic, water, and crop models that simulates national and international agricultural markets.
Following this line of research but at a regional scale, several approaches integrate physical, agroecological, and socioeconomic indicators. These indicators are grouped into the components of exposure, sensitivity, and adaptability using a composite index method. Finally, Agriculture, Vulnerability, and Adaptability (AVA) is a methodology for calculating the vulnerability of productive systems in the upper Cauca River basin in Colombia through multiple key indicators.
These types of approaches collect and generate valuable information in workshops organized among different stakeholders. However, several limitations increase the complexity of the entire process and make acceptable results harder to obtain. The first concerns the type of analysis (qualitative or quantitative) used in these studies. Qualitative approaches pose several challenges, so most approaches are predominantly quantitative. The second is the difficulty of reaching agreement between participants: expert panels can become unpleasant experiences when the stakeholders fail to reach a consensus. This leads to a third limitation, the time required to carry out the analyses, since the large amount of non-trivial work makes them slow to complete.
The above implies having sufficient and relevant data sources for such analyses. However, raw data in its native format tends to be diverse in structure, storage format, and access protocol. There are often intrinsic spatio-temporal relationships between different data sources, which may offer relevant knowledge for a given information query [9]. This need often goes unsatisfied because of data inconsistencies. If an adequate cleaning process is not applied, subsequent analyses will not be accurate enough. In other words, a great deal of information and knowledge is lost when erroneous data is fed into the analysis ("garbage in, garbage out").
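As an illustration of the kind of cleaning step discussed above, the following is a minimal sketch only and not the preparation process used in this study. The field names, the valid ranges, and the sample records are hypothetical and were chosen purely for the example; the function drops duplicated rows and rows with missing or out-of-range values.

```python
def clean_records(records, valid_ranges):
    """Drop duplicates and rows with missing or out-of-range values."""
    seen = set()
    cleaned = []
    for rec in records:
        # A record is a duplicate if all of its (field, value) pairs were seen before.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        # Keep the row only if every checked field is present and inside its range.
        ok = all(
            rec.get(field) is not None and low <= rec[field] <= high
            for field, (low, high) in valid_ranges.items()
        )
        if ok:
            cleaned.append(rec)
    return cleaned


# Hypothetical raw records, invented for illustration only.
raw = [
    {"station": "A", "temperature": 24.5, "humidity": 61.0},
    {"station": "A", "temperature": 24.5, "humidity": 61.0},   # exact duplicate
    {"station": "B", "temperature": None, "humidity": 70.0},   # missing value
    {"station": "C", "temperature": 22.0, "humidity": 140.0},  # impossible humidity
    {"station": "D", "temperature": 19.0, "humidity": 88.0},
]
# Assumed plausibility limits (degrees Celsius, percent).
ranges = {"temperature": (-50.0, 60.0), "humidity": (0.0, 100.0)}

cleaned = clean_records(raw, ranges)
print([rec["station"] for rec in cleaned])
```

In practice, removing a row is only one option; imputing or flagging the suspicious value may be preferable when records are scarce.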

DOMAIN INTRODUCTION:-
Weather forecasting is the application of science and technology to predict the state of the atmosphere for a given location. Ancient weather forecasting methods usually relied on observed patterns of events, also termed pattern recognition. For example, it might be observed that if the sunset was particularly red, the following day often brought fair weather. However, not all of these predictions prove reliable. The system described here predicts the weather based on parameters such as temperature, humidity, and wind. It is a web application with a graphical user interface. The user logs in with a user ID and password and enters the current temperature, humidity, and wind; the system takes these parameters and predicts the weather from previous data stored in the database. The role of the admin is to add previous weather data to the database so that the system can compute its predictions from these data.
The weather forecasting system takes parameters such as temperature, humidity, and wind and forecasts the weather based on previous records, which is intended to make the prediction more reliable. The system can be used in air traffic, marine, agriculture, forestry, military, and navy applications, among others. The main aim of this paper is to forecast the temperature and rain on a particular day and date. In the paper we forecast rain and temperature for Europe up to the year 2051, and the temperature of the world up to the year 2100. Our paper aims to provide a real-time weather forecast service at the finest level of granularity, together with recommendations. We obtain the user's location (longitude, latitude) through a GPS data service whenever the user requests our services. Our system processes the user's query and mines the data in our repository to draw appropriate results. Users are also provided with recommendations, which is the key facility of our service. A personalized forecast is generated for each individual user based on their location.

IJCRT2205936 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org h907

LITERATURE REVIEW:-
Weather forecasting has been one of the most challenging problems around the world because of both its practical value in meteorology and its popularity as a subject of scientific study. Weather is a continuous, dynamic, multidimensional, chaotic, and data-intensive process, and these properties make weather forecasting a stimulating challenge. It is one of the most imperative and demanding operational responsibilities carried out by meteorological services all over the globe. In 1999, Palmer, N.T showed that, when ensemble forecasts are used as input to a simple decision-model analysis, probability forecasts of weather and climate have greater potential economic value than corresponding single deterministic forecasts with uncertain accuracy.

A. Decision Tree
The C4.5 decision tree algorithm was developed to overcome the drawbacks of the ID3 algorithm. C4.5 follows a greedy approach and uses a series of rules for classification. Although this approach gives a high classification accuracy, it is sensitive to noisy data. Information gain (normalized as the gain ratio in C4.5) is the main metric used in the decision tree to choose the attribute for the root node.
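The choice of the root attribute by gain can be sketched as follows. This is a minimal illustration of plain information gain as used in ID3, not a C4.5 implementation (there is no gain-ratio normalization and no pruning), and the toy records and attribute names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy records, invented for illustration only.
records = [
    {"humidity": "high", "wind": "weak", "rain": "yes"},
    {"humidity": "high", "wind": "strong", "rain": "yes"},
    {"humidity": "low", "wind": "weak", "rain": "no"},
    {"humidity": "low", "wind": "strong", "rain": "no"},
]

gains = {attr: information_gain(records, attr, "rain") for attr in ("humidity", "wind")}
root = max(gains, key=gains.get)  # attribute with the highest gain becomes the root
print(gains, root)
```

In this toy set, humidity separates the two classes perfectly while wind carries no information about rain, so humidity is selected as the root.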

B. Naive Bayes
Naïve Bayes is a brute-force method for training the model. The underlying principle of the Naïve Bayesian classifier is Bayes' theorem. For the classification problem, each predictor attribute is considered separately with the class label when the model is constructed from the training dataset. The predictor attributes include the area, service calls, evening calls, night calls, etc. The conditional probability of each predictor attribute value is computed given the class label, which represents churn. A disadvantage reported for this method is that it is not well suited to large datasets.
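The following is a minimal sketch of a categorical Naïve Bayes classifier, applied here to weather-style attributes rather than to churn data. The discretized attribute values, the toy history, and the use of Laplace (add-one) smoothing are assumptions made for the example. For each class, the score is the log of the class prior plus the sum of the logs of the per-attribute conditional probabilities, and the class with the highest score is returned.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> Counter of values
    domains = defaultdict(set)           # attribute -> set of observed values
    for row in rows:
        for attr, value in row.items():
            if attr == target:
                continue
            value_counts[(attr, row[target])][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains


def predict(model, query):
    """Return the highest-scoring class, using log-probabilities and add-one smoothing."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for attr, value in query.items():
            seen = value_counts[(attr, label)][value]
            score += math.log((seen + 1) / (count + len(domains[attr])))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical historical records, invented for illustration only.
history = [
    {"temperature": "hot", "humidity": "high", "wind": "weak", "weather": "rain"},
    {"temperature": "hot", "humidity": "low", "wind": "weak", "weather": "sunny"},
    {"temperature": "mild", "humidity": "high", "wind": "strong", "weather": "rain"},
    {"temperature": "cool", "humidity": "low", "wind": "strong", "weather": "sunny"},
    {"temperature": "mild", "humidity": "high", "wind": "weak", "weather": "rain"},
    {"temperature": "hot", "humidity": "low", "wind": "strong", "weather": "sunny"},
]

model = train_naive_bayes(history, "weather")
print(predict(model, {"temperature": "mild", "humidity": "high", "wind": "weak"}))
```

Working in log space avoids numerical underflow when many attributes are multiplied, and the smoothing keeps an attribute value never seen with a class from forcing that class's probability to zero.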

C. Support Vector Machine
The SVM algorithm was proposed by Boser, Guyon, and Vapnik. It is widely used for both classification and regression problems. SVM maps the data points to a higher-dimensional space to make them linearly separable. The surface that divides the data points is known as the hyperplane. SVM can give an optimal solution on small datasets, but it is less effective on noisy data. Here, the SVM model tries to separate churn and non-churn customers. To divide the dataset into churner and non-churner groups, it first places all the data points in an n-dimensional space, where n is the number of predictor variables in the dataset, and then divides them into the two groups using the maximum-margin hyperplane.
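For reference, the maximum-margin hyperplane can be written as the solution of the standard hard-margin optimization problem below. The notation is an assumption introduced here for illustration: m training points x_i in n-dimensional space, labels y_i in {-1, +1} for the two groups, weight vector w, and bias b. The formulation assumes the two groups are linearly separable.

```latex
\min_{\mathbf{w},\, b} \; \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, m
```

The separating hyperplane is the set of points where w^T x + b = 0, and the width of the margin is 2 / ||w||, so minimizing ||w|| maximizes the margin. Noisy or overlapping data violate the constraints, which is why soft-margin variants with slack variables are commonly used in practice.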

D. Bagging
Bagging (bootstrap aggregating) is effective because it predicts test instances by aggregating, for example by majority vote, the predictions of a bag of classifiers, each trained on a bootstrap sample of the training data. Bagging requires heavier computational resources for model construction.
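A minimal sketch of bagging is given below, assuming a one-threshold "stump" on a single numeric attribute as the base classifier and majority voting as the aggregation rule. The humidity readings, labels, function names, and number of models are hypothetical and serve only to illustrate the idea.

```python
import random
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(sample):
    """Fit a one-threshold classifier on (value, label) pairs.

    Tries each observed value as a threshold and keeps the one with the
    fewest training errors; each side predicts its own majority label.
    """
    best = None
    for threshold in sorted({x for x, _ in sample}):
        left = [y for x, y in sample if x <= threshold]
        right = [y for x, y in sample if x > threshold]
        left_label = majority_vote(left)
        right_label = majority_vote(right) if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def bagging_fit(data, n_models, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [train_stump(rng.choices(data, k=len(data))) for _ in range(n_models)]


def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    return majority_vote([model(x) for model in models])


# Hypothetical (humidity %, label) pairs, invented for illustration only.
data = [(30, "dry"), (35, "dry"), (40, "dry"), (45, "dry"),
        (70, "rain"), (75, "rain"), (80, "rain"), (85, "rain")]

models = bagging_fit(data, n_models=15)
print(bagging_predict(models, 90))
```

The extra computational cost mentioned above is visible here: every prediction consults all 15 models, and each model is trained on its own resampled copy of the data.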

E. Boosting
The boosting ensemble technique maintains a weight for each training tuple. After a classifier is learned from the training tuples, the weights are updated so that the subsequent classifier pays more attention to the tuples that were misclassified.
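The weight update can be illustrated with one round of an AdaBoost-style scheme. AdaBoost is assumed here as a representative boosting algorithm, since the text does not name a specific one; the function name and the four-tuple example are hypothetical.

```python
import math


def adaboost_update(weights, correct):
    """One AdaBoost-style round: return (alpha, new_weights).

    `weights` are the current tuple weights (summing to 1) and `correct[i]`
    says whether the classifier just learned got tuple i right.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0 < error < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    # Vote strength of the classifier: larger when its weighted error is smaller.
    alpha = 0.5 * math.log((1 - error) / error)
    # Shrink the weights of correctly classified tuples, grow the others.
    scaled = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]


# Four tuples with equal initial weights; the last one was misclassified.
initial = [0.25, 0.25, 0.25, 0.25]
alpha, updated = adaboost_update(initial, [True, True, True, False])
print(alpha, updated)
```

After the update, the single misclassified tuple carries half of the total weight and the three correctly classified tuples share the other half, so the next classifier is pushed to get that tuple right.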

CONCLUSION AND FUTURE SCOPE:-
CONCLUSION:
Traditionally, weather forecasting has been performed by sampling the current state of the atmosphere and physically simulating the atmosphere as a fluid. In this earlier approach, the future state of the atmosphere is computed by numerically solving the equations of fluid dynamics and thermodynamics. However, this model is sometimes unstable under disturbances and under uncertainties in the measurement of the initial conditions of the atmosphere. Together with an incomplete understanding of atmospheric processes, this restricts weather prediction.
Our proposed solution of using machine learning for weather prediction is relatively robust to most atmospheric disturbances when compared to traditional methods. Another advantage of machine learning is that it does not depend on an explicit model of the physical laws governing atmospheric processes. In the long run, weather prediction using machine learning offers many advantages and merits wider adoption.

FUTURE SCOPE:
The project mainly focuses on forecasting weather conditions using historical data. This can be done by extracting knowledge from the given data using techniques such as association, pattern recognition, and nearest neighbor.
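As an example of the nearest neighbor technique mentioned above, the sketch below predicts a weather label from the k most similar historical records. The records, the units (degrees Celsius, percent humidity, km/h wind), and the choice of k = 3 with Euclidean distance are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(history, query, k=3):
    """Predict the label of `query` from its k nearest historical records."""
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical (temperature, humidity, wind) records, invented for illustration only.
history = [
    ((30.0, 40.0, 10.0), "sunny"),
    ((32.0, 35.0, 12.0), "sunny"),
    ((28.0, 45.0, 8.0), "sunny"),
    ((20.0, 90.0, 25.0), "rain"),
    ((18.0, 85.0, 30.0), "rain"),
    ((22.0, 95.0, 20.0), "rain"),
]

print(knn_predict(history, (21.0, 88.0, 24.0)))
```

Because the three attributes are measured on different scales, a real system would normally standardize them before computing distances; this step is omitted here to keep the sketch short.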
Disaster mitigation: predicting storms, floods, and droughts helps the sectors that depend most on weather conditions, such as agriculture and aviation.