Cloudera Quick Start VM in Hyper-V

Overview

Marc Andreessen published his now-famous essay, "Why Software Is Eating the World", in The Wall Street Journal in 2011. Dig under the covers, however, and it is really virtualization that powers this phenomenon. Today, almost every hardware component is virtualized: processors, memory, storage, network load balancers, routers, switches and firewalls. Even a user is now "virtualized" in scenarios such as load testing and synthetic monitoring. As a concept, money is virtualized too, in the form of cryptocurrencies.
 
In this post, let's go through how to set up virtualization on a local machine and get a better understanding of its concepts.
 

Virtualization 

A key driver and benefit of virtualization is the sharing of physical resources. Before we dive in, let's define some concepts.
  • Host Machine - This can be a server rack, a single server or a desktop/laptop. In our case, it is a laptop. Typically, hardware virtualization support has to be enabled in the BIOS.
  • Host Resources - These are the "physical" resources in the host machine, such as CPU, memory, network cards, disk, display and peripheral devices.
  • Host OS - The operating system of the host machine. Virtualization software/drivers need to be installed in the Host OS too.
  • Guest VM - This is the "virtual machine" that will be created. One can create multiple VMs on a single physical machine.
  • Guest Resources - Each guest VM can have its own set of resources. For example, it is possible to create a VM without a display attached to it.
  • Guest OS - Each VM can run a different operating system and its own applications.
  • Hypervisor - The component that shares the host resources among the guest VMs. It is either Type 1 or Type 2.

Hypervisor

The key construct of virtualization is the hypervisor, the component that allocates and manages the physical resources among the virtual machines. There are broadly two types of hypervisors: Type 1 and Type 2.
 
Image Source: IBM
As seen in the figure above, Type 1 hypervisors (Hyper-V, VMware ESX etc.) run directly on the host hardware rather than on top of a general-purpose OS. Type 2 hypervisors (Oracle VirtualBox, VMware Player etc.) run as an "application" on top of the Host OS. For running applications in a "live" environment, Type 1 is typically used; Type 2 is more common in "development" environments. With Hyper-V from Microsoft, however, it is possible to use a Type 1 hypervisor on a Windows desktop/laptop.
 

Exercise

In this example, the following are used -
  • Host Machine - Dell Inspiron 15 7000 Gaming laptop.
  • Host Resources - The laptop has 8 logical cores (i7 processor), 16 GB RAM, a 1 TB hard disk and 1 network card.
  • Host OS - Windows 10 Professional.
  • Guest VM - 1 Guest VM.
  • Guest Resources - The Guest VM has 4 cores, 10 GB RAM and a virtual hard disk assigned to it.
  • Guest OS - CentOS 6.
  • Hypervisor - Type 1 (Windows Hyper-V).
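
Before creating the VM, it can help to check that the planned guest allocation fits within the host. The sketch below is only an illustration using the numbers from the list above; the Resources class and the fits function are hypothetical names, and the check is deliberately simplistic (real hypervisors can overcommit virtual CPUs, and the host OS needs memory of its own).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Resources:
    """CPU cores and RAM, either physical (host) or assigned (guest)."""
    cores: int
    ram_gb: int


def fits(host: Resources, guests: Sequence[Resources]) -> bool:
    """Return True if the guests' combined allocation fits in the host."""
    total_cores = sum(g.cores for g in guests)
    total_ram = sum(g.ram_gb for g in guests)
    return total_cores <= host.cores and total_ram <= host.ram_gb


# Numbers taken from the exercise: 8 logical cores / 16 GB on the host,
# 4 cores / 10 GB (maximum) for the single guest VM.
host = Resources(cores=8, ram_gb=16)
guest = Resources(cores=4, ram_gb=10)

print(fits(host, [guest]))         # True
print(fits(host, [guest, guest]))  # False: 20 GB exceeds the 16 GB host
```

With these numbers, one guest leaves 4 cores and 6 GB for the Host OS, while a second identical guest would not fit in memory.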
 

Prerequisites

 
  • Ensure that virtualization is enabled in the desktop/laptop BIOS.
  • Open the "Windows Features" option under "Control Panel", enable the Hyper-V option, and restart the desktop/laptop.
 
 
 

Setup

In this example, let's deliberately take a complex use case: we need a big data framework to be set up. Let's look at the Cloudera distribution. Cloudera internally consists of various software components such as HDFS, HBase, Impala, Solr, Spark etc. It is possible to set up these components separately, but doing so is time-consuming and error-prone. Here virtualization comes to the rescue!
 
The steps are -
 
  • Download the VMware image of the Cloudera Quick Start VM. Copy it to a folder, say "D:\Temp".
  • Install Oracle VirtualBox.
  • Open the VirtualBox folder in Command Prompt. Run this command to convert the disk image from the VMware (VMDK) format to the Hyper-V (VHD) format - vboxmanage clonehd "D:\Temp\cloudera-quickstart-vm-5.13.0-0-vmware\cloudera-quickstart-vm-5.13.0-0-vmware.vmdk" "D:\Temp\cloudera-quickstart-vm-5.13.0-0-vmware.vhd" --format vhd
  • Open the Hyper-V Manager. Create a new external virtual switch and call it 'External Switch'.
  • Create a new VM with the following settings -
    • Generation 1 VM
    • 4 cores
    • Minimum 8 GB, maximum 10 GB RAM (with dynamic allocation)
    • Instead of creating a new virtual hard disk, assign the virtual hard disk created above
    • Assign the 'External Switch'
  • Start the VM.
  • The user id and password for the VM and all services is cloudera/cloudera.
  • Allow the various Cloudera services to start. Depending on the amount of CPU and memory assigned to the VM, this may take 3 to 5 minutes.
  • Once all the services are up, the Cloudera instance can be used by connecting to the VM.
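
If the conversion has to be repeated, the vboxmanage step above can be scripted. The following is a minimal sketch, not part of the original procedure: build_clonehd_command is a hypothetical helper that only assembles the same command as an argument list, and the paths are the ones used in this post. Actually running it assumes the vboxmanage executable can be found, for example by starting the script from the VirtualBox folder.

```python
import subprocess


def build_clonehd_command(source_vmdk: str, target_vhd: str) -> list:
    """Assemble the vboxmanage call that clones a VMDK disk into VHD format."""
    return ["vboxmanage", "clonehd", source_vmdk, target_vhd, "--format", "vhd"]


# Paths from the steps above; adjust them to wherever the image was copied.
source = (r"D:\Temp\cloudera-quickstart-vm-5.13.0-0-vmware"
          r"\cloudera-quickstart-vm-5.13.0-0-vmware.vmdk")
target = r"D:\Temp\cloudera-quickstart-vm-5.13.0-0-vmware.vhd"

command = build_clonehd_command(source, target)

# Show the command line that would be executed.
print(subprocess.list2cmdline(command))

# Uncomment to perform the conversion; check=True raises if vboxmanage fails.
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids quoting problems if the folder names ever contain spaces.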
 

Tests

To test the setup, get the IP address of the VM and open http://[VM IP Address]:7180 in the host machine's browser to access Cloudera Manager.
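
Since the services take a few minutes to come up, a quick reachability check from the host can save some guesswork. This is a rough sketch under stated assumptions: service_url and is_port_open are hypothetical helper names, 192.0.2.10 is a placeholder for the real VM IP address, and an open TCP port only suggests that the service is listening, not that it is fully initialized.

```python
import socket


def service_url(vm_ip: str, port: int) -> str:
    """Build the browser URL for a service running in the VM."""
    return f"http://{vm_ip}:{port}"


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


VM_IP = "192.0.2.10"  # placeholder; replace with the actual VM IP address

# Ports mentioned in this post: 7180 (Cloudera Manager) and 8888 (Hue).
for name, port in [("Cloudera Manager", 7180), ("Hue", 8888)]:
    print(name, "->", service_url(VM_IP, port))

# Example check once the VM is running:
# print(is_port_open(VM_IP, 7180))
```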
 
 

This is a big data framework after all, so let's now see if we can ingest some sample data.
  • Open Hue at http://[VM IP Address]:8888
  • The user ID/password is cloudera/cloudera.
  • Using Hue, create a table in Impala by running this command - CREATE TABLE default.t1 (x INT, y STRING);
  • Next, insert multiple rows using - INSERT INTO default.t1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
  • Finally, query the table using - SELECT * FROM default.t1;
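
The same three statements could also be sent from a script instead of Hue. The sketch below is an assumption, not something the original post does: run_statements is a hypothetical helper that works with any Python DB-API 2.0 style cursor, and DryRunCursor is a stand-in that only prints the SQL so the example runs without a cluster. A real cursor would have to come from a DB-API driver for Impala (impyla is one such library), connected to the VM.

```python
# The statements from the steps above, in execution order.
STATEMENTS = [
    "CREATE TABLE default.t1 (x INT, y STRING)",
    "INSERT INTO default.t1 VALUES (1, 'one'), (2, 'two'), (3, 'three')",
    "SELECT * FROM default.t1",
]


def run_statements(cursor, statements):
    """Execute statements in order and return the rows of the last one, if any."""
    rows = None
    for sql in statements:
        cursor.execute(sql)
        # DB-API cursors expose `description` only when a result set exists.
        rows = cursor.fetchall() if cursor.description is not None else None
    return rows


class DryRunCursor:
    """Stand-in cursor that prints each statement instead of executing it."""
    description = None

    def execute(self, sql):
        print("would run:", sql)

    def fetchall(self):
        return []


run_statements(DryRunCursor(), STATEMENTS)
```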
 

 

Benefits

From a development perspective, here are the advantages of virtualization -
 
  • It saves the time of having to set up multiple components for a complex piece of software.
  • A VM can be set up in minutes to run any kind of application.
  • If, for some reason, the VM is corrupted, it can always be re-installed.
  • Each developer in the team can have their own full-fledged Cloudera instance for developing, testing, troubleshooting etc.

[Source] https://www.rajansview.com/2019/03/cloudera-quick-start-vm-in-hyper-v.html
