A Technique for Extracting Personalized XML-Based Logs Using Session Information
Web mining applies mining techniques to web logs in order to analyze user behavior. It has been researched very actively in recent years thanks to the growth in Internet use. Three web-log collection techniques are mainly used in web mining: server-side, client-side, and proxy-side. This thesis proposes a server-side technique that collects information effectively in real time through a remote agent. Personalization, one of the purposes of web mining, provides personalized services by analyzing individual behavior, and it is a very important element of recommendation systems and eCRM. However, it is difficult to identify a user from IP information alone, and a preprocessing step is needed to remove unnecessary information. Preprocessing is itself difficult because web logs are expressed in various formats. To solve this problem, methods that analyze the content of a web log and transform it into XML have been studied. However, because existing methods transform the web-log file stored on the web server, they can mine only the information already contained in that log. In this thesis, we propose a method for creating personalized XML-based web logs at the web server. We define the interval from a user's login to logout as one session and one transaction. When the session closes, we create an XML-based web log for that session and summarize the log information using the site's concept hierarchy. The resulting web log makes personalization easier because it includes user information and session information for identification. The proposed method also removes unnecessary data such as images and script information, and it reduces the volume of web-log data by summarizing the log information.
The proposed system reduced the size of the web log by more than 90% and created personalized information from the user's login session information. Because the recording format is XML, the log can be transformed into other formats more efficiently and easily than with existing methods.
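The session-based summarization described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the element and attribute names (`session`, `concept`, `hits`), the list of noise extensions, and the prefix-based concept hierarchy are all assumptions made for illustration. The sketch drops image and script requests, maps each remaining URL to a concept by longest matching path prefix, and writes one XML record per login session.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed set of extensions treated as unnecessary data (images, scripts, styles).
NOISE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".ico", ".js", ".css"}

# Hypothetical site concept hierarchy: URL path prefix -> concept path.
CONCEPT_HIERARCHY = {
    "/shop/books": "Shop/Books",
    "/shop/music": "Shop/Music",
    "/support": "Support",
}


def is_noise(url):
    """Return True for requests that carry no behavioral information."""
    path = urlparse(url).path
    return PurePosixPath(path).suffix.lower() in NOISE_EXTENSIONS


def concept_for(url):
    """Map a URL to a concept using the longest matching path prefix."""
    path = urlparse(url).path
    matches = [p for p in CONCEPT_HIERARCHY
               if path == p or path.startswith(p + "/")]
    if not matches:
        return "Other"
    return CONCEPT_HIERARCHY[max(matches, key=len)]


def session_to_xml(user_id, session_id, login_time, logout_time, requests):
    """Summarize one login-to-logout session as a single XML record."""
    pages = [url for url in requests if not is_noise(url)]
    counts = Counter(concept_for(url) for url in pages)

    # User and session identifiers are stored as attributes of the record.
    root = ET.Element("session", id=session_id, user=user_id,
                      login=login_time, logout=logout_time)
    for concept, hits in sorted(counts.items()):
        ET.SubElement(root, "concept", name=concept, hits=str(hits))
    return ET.tostring(root, encoding="unicode")
```

Calling `session_to_xml` once when a session closes yields a compact record in which many raw request lines collapse into a few `concept` elements. The size reduction achieved this way depends on the site and the hierarchy chosen, so the sketch makes no claim about the figure reported above.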