{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Speech Of DR APJ Abdul Kalam\n",
    "paragraph = \"\"\"I have three visions for India. In 3000 years of our history, people from all over \n",
    "               the world have come and invaded us, captured our lands, conquered our minds. \n",
    "               From Alexander onwards, the Greeks, the Turks, the Moguls, the Portuguese, the British,\n",
    "               the French, the Dutch, all of them came and looted us, took over what was ours. \n",
    "               Yet we have not done this to any other nation. We have not conquered anyone. \n",
    "               We have not grabbed their land, their culture, \n",
    "               their history and tried to enforce our way of life on them. \n",
    "               Why? Because we respect the freedom of others.That is why my \n",
    "               first vision is that of freedom. I believe that India got its first vision of \n",
    "               this in 1857, when we started the War of Independence. It is this freedom that\n",
    "               we must protect and nurture and build on. If we are not free, no one will respect us.\n",
    "               My second vision for India’s development. For fifty years we have been a developing nation.\n",
    "               It is time we see ourselves as a developed nation. We are among the top 5 nations of the world\n",
    "               in terms of GDP. We have a 10 percent growth rate in most areas. Our poverty levels are falling.\n",
    "               Our achievements are being globally recognised today. Yet we lack the self-confidence to\n",
    "               see ourselves as a developed nation, self-reliant and self-assured. Isn’t this incorrect?\n",
    "               I have a third vision. India must stand up to the world. Because I believe that unless India \n",
    "               stands up to the world, no one will respect us. Only strength respects strength. We must be \n",
    "               strong not only as a military power but also as an economic power. Both must go hand-in-hand. \n",
    "               My good fortune was to have worked with three great minds. Dr. Vikram Sarabhai of the Dept. of \n",
    "               space, Professor Satish Dhawan, who succeeded him and Dr. Brahm Prakash, father of nuclear material.\n",
    "               I was lucky to have worked with all three of them closely and consider this the great opportunity of my life. \n",
    "               I see four milestones in my career\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nltk.stem import PorterStemmer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nltk.corpus import stopwords"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[nltk_data] Downloading package stopwords to\n",
      "[nltk_data]     C:\\Users\\win10\\AppData\\Roaming\\nltk_data...\n",
      "[nltk_data]   Package stopwords is already up-to-date!\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import nltk\n",
    "nltk.download('stopwords')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['i',\n",
       " 'me',\n",
       " 'my',\n",
       " 'myself',\n",
       " 'we',\n",
       " 'our',\n",
       " 'ours',\n",
       " 'ourselves',\n",
       " 'you',\n",
       " \"you're\",\n",
       " \"you've\",\n",
       " \"you'll\",\n",
       " \"you'd\",\n",
       " 'your',\n",
       " 'yours',\n",
       " 'yourself',\n",
       " 'yourselves',\n",
       " 'he',\n",
       " 'him',\n",
       " 'his',\n",
       " 'himself',\n",
       " 'she',\n",
       " \"she's\",\n",
       " 'her',\n",
       " 'hers',\n",
       " 'herself',\n",
       " 'it',\n",
       " \"it's\",\n",
       " 'its',\n",
       " 'itself',\n",
       " 'they',\n",
       " 'them',\n",
       " 'their',\n",
       " 'theirs',\n",
       " 'themselves',\n",
       " 'what',\n",
       " 'which',\n",
       " 'who',\n",
       " 'whom',\n",
       " 'this',\n",
       " 'that',\n",
       " \"that'll\",\n",
       " 'these',\n",
       " 'those',\n",
       " 'am',\n",
       " 'is',\n",
       " 'are',\n",
       " 'was',\n",
       " 'were',\n",
       " 'be',\n",
       " 'been',\n",
       " 'being',\n",
       " 'have',\n",
       " 'has',\n",
       " 'had',\n",
       " 'having',\n",
       " 'do',\n",
       " 'does',\n",
       " 'did',\n",
       " 'doing',\n",
       " 'a',\n",
       " 'an',\n",
       " 'the',\n",
       " 'and',\n",
       " 'but',\n",
       " 'if',\n",
       " 'or',\n",
       " 'because',\n",
       " 'as',\n",
       " 'until',\n",
       " 'while',\n",
       " 'of',\n",
       " 'at',\n",
       " 'by',\n",
       " 'for',\n",
       " 'with',\n",
       " 'about',\n",
       " 'against',\n",
       " 'between',\n",
       " 'into',\n",
       " 'through',\n",
       " 'during',\n",
       " 'before',\n",
       " 'after',\n",
       " 'above',\n",
       " 'below',\n",
       " 'to',\n",
       " 'from',\n",
       " 'up',\n",
       " 'down',\n",
       " 'in',\n",
       " 'out',\n",
       " 'on',\n",
       " 'off',\n",
       " 'over',\n",
       " 'under',\n",
       " 'again',\n",
       " 'further',\n",
       " 'then',\n",
       " 'once',\n",
       " 'here',\n",
       " 'there',\n",
       " 'when',\n",
       " 'where',\n",
       " 'why',\n",
       " 'how',\n",
       " 'all',\n",
       " 'any',\n",
       " 'both',\n",
       " 'each',\n",
       " 'few',\n",
       " 'more',\n",
       " 'most',\n",
       " 'other',\n",
       " 'some',\n",
       " 'such',\n",
       " 'no',\n",
       " 'nor',\n",
       " 'not',\n",
       " 'only',\n",
       " 'own',\n",
       " 'same',\n",
       " 'so',\n",
       " 'than',\n",
       " 'too',\n",
       " 'very',\n",
       " 's',\n",
       " 't',\n",
       " 'can',\n",
       " 'will',\n",
       " 'just',\n",
       " 'don',\n",
       " \"don't\",\n",
       " 'should',\n",
       " \"should've\",\n",
       " 'now',\n",
       " 'd',\n",
       " 'll',\n",
       " 'm',\n",
       " 'o',\n",
       " 're',\n",
       " 've',\n",
       " 'y',\n",
       " 'ain',\n",
       " 'aren',\n",
       " \"aren't\",\n",
       " 'couldn',\n",
       " \"couldn't\",\n",
       " 'didn',\n",
       " \"didn't\",\n",
       " 'doesn',\n",
       " \"doesn't\",\n",
       " 'hadn',\n",
       " \"hadn't\",\n",
       " 'hasn',\n",
       " \"hasn't\",\n",
       " 'haven',\n",
       " \"haven't\",\n",
       " 'isn',\n",
       " \"isn't\",\n",
       " 'ma',\n",
       " 'mightn',\n",
       " \"mightn't\",\n",
       " 'mustn',\n",
       " \"mustn't\",\n",
       " 'needn',\n",
       " \"needn't\",\n",
       " 'shan',\n",
       " \"shan't\",\n",
       " 'shouldn',\n",
       " \"shouldn't\",\n",
       " 'wasn',\n",
       " \"wasn't\",\n",
       " 'weren',\n",
       " \"weren't\",\n",
       " 'won',\n",
       " \"won't\",\n",
       " 'wouldn',\n",
       " \"wouldn't\"]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stopwords.words('english')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['إذ',\n",
       " 'إذا',\n",
       " 'إذما',\n",
       " 'إذن',\n",
       " 'أف',\n",
       " 'أقل',\n",
       " 'أكثر',\n",
       " 'ألا',\n",
       " 'إلا',\n",
       " 'التي',\n",
       " 'الذي',\n",
       " 'الذين',\n",
       " 'اللاتي',\n",
       " 'اللائي',\n",
       " 'اللتان',\n",
       " 'اللتيا',\n",
       " 'اللتين',\n",
       " 'اللذان',\n",
       " 'اللذين',\n",
       " 'اللواتي',\n",
       " 'إلى',\n",
       " 'إليك',\n",
       " 'إليكم',\n",
       " 'إليكما',\n",
       " 'إليكن',\n",
       " 'أم',\n",
       " 'أما',\n",
       " 'أما',\n",
       " 'إما',\n",
       " 'أن',\n",
       " 'إن',\n",
       " 'إنا',\n",
       " 'أنا',\n",
       " 'أنت',\n",
       " 'أنتم',\n",
       " 'أنتما',\n",
       " 'أنتن',\n",
       " 'إنما',\n",
       " 'إنه',\n",
       " 'أنى',\n",
       " 'أنى',\n",
       " 'آه',\n",
       " 'آها',\n",
       " 'أو',\n",
       " 'أولاء',\n",
       " 'أولئك',\n",
       " 'أوه',\n",
       " 'آي',\n",
       " 'أي',\n",
       " 'أيها',\n",
       " 'إي',\n",
       " 'أين',\n",
       " 'أين',\n",
       " 'أينما',\n",
       " 'إيه',\n",
       " 'بخ',\n",
       " 'بس',\n",
       " 'بعد',\n",
       " 'بعض',\n",
       " 'بك',\n",
       " 'بكم',\n",
       " 'بكم',\n",
       " 'بكما',\n",
       " 'بكن',\n",
       " 'بل',\n",
       " 'بلى',\n",
       " 'بما',\n",
       " 'بماذا',\n",
       " 'بمن',\n",
       " 'بنا',\n",
       " 'به',\n",
       " 'بها',\n",
       " 'بهم',\n",
       " 'بهما',\n",
       " 'بهن',\n",
       " 'بي',\n",
       " 'بين',\n",
       " 'بيد',\n",
       " 'تلك',\n",
       " 'تلكم',\n",
       " 'تلكما',\n",
       " 'ته',\n",
       " 'تي',\n",
       " 'تين',\n",
       " 'تينك',\n",
       " 'ثم',\n",
       " 'ثمة',\n",
       " 'حاشا',\n",
       " 'حبذا',\n",
       " 'حتى',\n",
       " 'حيث',\n",
       " 'حيثما',\n",
       " 'حين',\n",
       " 'خلا',\n",
       " 'دون',\n",
       " 'ذا',\n",
       " 'ذات',\n",
       " 'ذاك',\n",
       " 'ذان',\n",
       " 'ذانك',\n",
       " 'ذلك',\n",
       " 'ذلكم',\n",
       " 'ذلكما',\n",
       " 'ذلكن',\n",
       " 'ذه',\n",
       " 'ذو',\n",
       " 'ذوا',\n",
       " 'ذواتا',\n",
       " 'ذواتي',\n",
       " 'ذي',\n",
       " 'ذين',\n",
       " 'ذينك',\n",
       " 'ريث',\n",
       " 'سوف',\n",
       " 'سوى',\n",
       " 'شتان',\n",
       " 'عدا',\n",
       " 'عسى',\n",
       " 'عل',\n",
       " 'على',\n",
       " 'عليك',\n",
       " 'عليه',\n",
       " 'عما',\n",
       " 'عن',\n",
       " 'عند',\n",
       " 'غير',\n",
       " 'فإذا',\n",
       " 'فإن',\n",
       " 'فلا',\n",
       " 'فمن',\n",
       " 'في',\n",
       " 'فيم',\n",
       " 'فيما',\n",
       " 'فيه',\n",
       " 'فيها',\n",
       " 'قد',\n",
       " 'كأن',\n",
       " 'كأنما',\n",
       " 'كأي',\n",
       " 'كأين',\n",
       " 'كذا',\n",
       " 'كذلك',\n",
       " 'كل',\n",
       " 'كلا',\n",
       " 'كلاهما',\n",
       " 'كلتا',\n",
       " 'كلما',\n",
       " 'كليكما',\n",
       " 'كليهما',\n",
       " 'كم',\n",
       " 'كم',\n",
       " 'كما',\n",
       " 'كي',\n",
       " 'كيت',\n",
       " 'كيف',\n",
       " 'كيفما',\n",
       " 'لا',\n",
       " 'لاسيما',\n",
       " 'لدى',\n",
       " 'لست',\n",
       " 'لستم',\n",
       " 'لستما',\n",
       " 'لستن',\n",
       " 'لسن',\n",
       " 'لسنا',\n",
       " 'لعل',\n",
       " 'لك',\n",
       " 'لكم',\n",
       " 'لكما',\n",
       " 'لكن',\n",
       " 'لكنما',\n",
       " 'لكي',\n",
       " 'لكيلا',\n",
       " 'لم',\n",
       " 'لما',\n",
       " 'لن',\n",
       " 'لنا',\n",
       " 'له',\n",
       " 'لها',\n",
       " 'لهم',\n",
       " 'لهما',\n",
       " 'لهن',\n",
       " 'لو',\n",
       " 'لولا',\n",
       " 'لوما',\n",
       " 'لي',\n",
       " 'لئن',\n",
       " 'ليت',\n",
       " 'ليس',\n",
       " 'ليسا',\n",
       " 'ليست',\n",
       " 'ليستا',\n",
       " 'ليسوا',\n",
       " 'ما',\n",
       " 'ماذا',\n",
       " 'متى',\n",
       " 'مذ',\n",
       " 'مع',\n",
       " 'مما',\n",
       " 'ممن',\n",
       " 'من',\n",
       " 'منه',\n",
       " 'منها',\n",
       " 'منذ',\n",
       " 'مه',\n",
       " 'مهما',\n",
       " 'نحن',\n",
       " 'نحو',\n",
       " 'نعم',\n",
       " 'ها',\n",
       " 'هاتان',\n",
       " 'هاته',\n",
       " 'هاتي',\n",
       " 'هاتين',\n",
       " 'هاك',\n",
       " 'هاهنا',\n",
       " 'هذا',\n",
       " 'هذان',\n",
       " 'هذه',\n",
       " 'هذي',\n",
       " 'هذين',\n",
       " 'هكذا',\n",
       " 'هل',\n",
       " 'هلا',\n",
       " 'هم',\n",
       " 'هما',\n",
       " 'هن',\n",
       " 'هنا',\n",
       " 'هناك',\n",
       " 'هنالك',\n",
       " 'هو',\n",
       " 'هؤلاء',\n",
       " 'هي',\n",
       " 'هيا',\n",
       " 'هيت',\n",
       " 'هيهات',\n",
       " 'والذي',\n",
       " 'والذين',\n",
       " 'وإذ',\n",
       " 'وإذا',\n",
       " 'وإن',\n",
       " 'ولا',\n",
       " 'ولكن',\n",
       " 'ولو',\n",
       " 'وما',\n",
       " 'ومن',\n",
       " 'وهو',\n",
       " 'يا',\n",
       " 'أبٌ',\n",
       " 'أخٌ',\n",
       " 'حمٌ',\n",
       " 'فو',\n",
       " 'أنتِ',\n",
       " 'يناير',\n",
       " 'فبراير',\n",
       " 'مارس',\n",
       " 'أبريل',\n",
       " 'مايو',\n",
       " 'يونيو',\n",
       " 'يوليو',\n",
       " 'أغسطس',\n",
       " 'سبتمبر',\n",
       " 'أكتوبر',\n",
       " 'نوفمبر',\n",
       " 'ديسمبر',\n",
       " 'جانفي',\n",
       " 'فيفري',\n",
       " 'مارس',\n",
       " 'أفريل',\n",
       " 'ماي',\n",
       " 'جوان',\n",
       " 'جويلية',\n",
       " 'أوت',\n",
       " 'كانون',\n",
       " 'شباط',\n",
       " 'آذار',\n",
       " 'نيسان',\n",
       " 'أيار',\n",
       " 'حزيران',\n",
       " 'تموز',\n",
       " 'آب',\n",
       " 'أيلول',\n",
       " 'تشرين',\n",
       " 'دولار',\n",
       " 'دينار',\n",
       " 'ريال',\n",
       " 'درهم',\n",
       " 'ليرة',\n",
       " 'جنيه',\n",
       " 'قرش',\n",
       " 'مليم',\n",
       " 'فلس',\n",
       " 'هللة',\n",
       " 'سنتيم',\n",
       " 'يورو',\n",
       " 'ين',\n",
       " 'يوان',\n",
       " 'شيكل',\n",
       " 'واحد',\n",
       " 'اثنان',\n",
       " 'ثلاثة',\n",
       " 'أربعة',\n",
       " 'خمسة',\n",
       " 'ستة',\n",
       " 'سبعة',\n",
       " 'ثمانية',\n",
       " 'تسعة',\n",
       " 'عشرة',\n",
       " 'أحد',\n",
       " 'اثنا',\n",
       " 'اثني',\n",
       " 'إحدى',\n",
       " 'ثلاث',\n",
       " 'أربع',\n",
       " 'خمس',\n",
       " 'ست',\n",
       " 'سبع',\n",
       " 'ثماني',\n",
       " 'تسع',\n",
       " 'عشر',\n",
       " 'ثمان',\n",
       " 'سبت',\n",
       " 'أحد',\n",
       " 'اثنين',\n",
       " 'ثلاثاء',\n",
       " 'أربعاء',\n",
       " 'خميس',\n",
       " 'جمعة',\n",
       " 'أول',\n",
       " 'ثان',\n",
       " 'ثاني',\n",
       " 'ثالث',\n",
       " 'رابع',\n",
       " 'خامس',\n",
       " 'سادس',\n",
       " 'سابع',\n",
       " 'ثامن',\n",
       " 'تاسع',\n",
       " 'عاشر',\n",
       " 'حادي',\n",
       " 'أ',\n",
       " 'ب',\n",
       " 'ت',\n",
       " 'ث',\n",
       " 'ج',\n",
       " 'ح',\n",
       " 'خ',\n",
       " 'د',\n",
       " 'ذ',\n",
       " 'ر',\n",
       " 'ز',\n",
       " 'س',\n",
       " 'ش',\n",
       " 'ص',\n",
       " 'ض',\n",
       " 'ط',\n",
       " 'ظ',\n",
       " 'ع',\n",
       " 'غ',\n",
       " 'ف',\n",
       " 'ق',\n",
       " 'ك',\n",
       " 'ل',\n",
       " 'م',\n",
       " 'ن',\n",
       " 'ه',\n",
       " 'و',\n",
       " 'ي',\n",
       " 'ء',\n",
       " 'ى',\n",
       " 'آ',\n",
       " 'ؤ',\n",
       " 'ئ',\n",
       " 'أ',\n",
       " 'ة',\n",
       " 'ألف',\n",
       " 'باء',\n",
       " 'تاء',\n",
       " 'ثاء',\n",
       " 'جيم',\n",
       " 'حاء',\n",
       " 'خاء',\n",
       " 'دال',\n",
       " 'ذال',\n",
       " 'راء',\n",
       " 'زاي',\n",
       " 'سين',\n",
       " 'شين',\n",
       " 'صاد',\n",
       " 'ضاد',\n",
       " 'طاء',\n",
       " 'ظاء',\n",
       " 'عين',\n",
       " 'غين',\n",
       " 'فاء',\n",
       " 'قاف',\n",
       " 'كاف',\n",
       " 'لام',\n",
       " 'ميم',\n",
       " 'نون',\n",
       " 'هاء',\n",
       " 'واو',\n",
       " 'ياء',\n",
       " 'همزة',\n",
       " 'ي',\n",
       " 'نا',\n",
       " 'ك',\n",
       " 'كن',\n",
       " 'ه',\n",
       " 'إياه',\n",
       " 'إياها',\n",
       " 'إياهما',\n",
       " 'إياهم',\n",
       " 'إياهن',\n",
       " 'إياك',\n",
       " 'إياكما',\n",
       " 'إياكم',\n",
       " 'إياك',\n",
       " 'إياكن',\n",
       " 'إياي',\n",
       " 'إيانا',\n",
       " 'أولالك',\n",
       " 'تانِ',\n",
       " 'تانِك',\n",
       " 'تِه',\n",
       " 'تِي',\n",
       " 'تَيْنِ',\n",
       " 'ثمّ',\n",
       " 'ثمّة',\n",
       " 'ذانِ',\n",
       " 'ذِه',\n",
       " 'ذِي',\n",
       " 'ذَيْنِ',\n",
       " 'هَؤلاء',\n",
       " 'هَاتانِ',\n",
       " 'هَاتِه',\n",
       " 'هَاتِي',\n",
       " 'هَاتَيْنِ',\n",
       " 'هَذا',\n",
       " 'هَذانِ',\n",
       " 'هَذِه',\n",
       " 'هَذِي',\n",
       " 'هَذَيْنِ',\n",
       " 'الألى',\n",
       " 'الألاء',\n",
       " 'أل',\n",
       " 'أنّى',\n",
       " 'أيّ',\n",
       " 'ّأيّان',\n",
       " 'أنّى',\n",
       " 'أيّ',\n",
       " 'ّأيّان',\n",
       " 'ذيت',\n",
       " 'كأيّ',\n",
       " 'كأيّن',\n",
       " 'بضع',\n",
       " 'فلان',\n",
       " 'وا',\n",
       " 'آمينَ',\n",
       " 'آهِ',\n",
       " 'آهٍ',\n",
       " 'آهاً',\n",
       " 'أُفٍّ',\n",
       " 'أُفٍّ',\n",
       " 'أفٍّ',\n",
       " 'أمامك',\n",
       " 'أمامكَ',\n",
       " 'أوّهْ',\n",
       " 'إلَيْكَ',\n",
       " 'إلَيْكَ',\n",
       " 'إليكَ',\n",
       " 'إليكنّ',\n",
       " 'إيهٍ',\n",
       " 'بخٍ',\n",
       " 'بسّ',\n",
       " 'بَسْ',\n",
       " 'بطآن',\n",
       " 'بَلْهَ',\n",
       " 'حاي',\n",
       " 'حَذارِ',\n",
       " 'حيَّ',\n",
       " 'حيَّ',\n",
       " 'دونك',\n",
       " 'رويدك',\n",
       " 'سرعان',\n",
       " 'شتانَ',\n",
       " 'شَتَّانَ',\n",
       " 'صهْ',\n",
       " 'صهٍ',\n",
       " 'طاق',\n",
       " 'طَق',\n",
       " 'عَدَسْ',\n",
       " 'كِخ',\n",
       " 'مكانَك',\n",
       " 'مكانَك',\n",
       " 'مكانَك',\n",
       " 'مكانكم',\n",
       " 'مكانكما',\n",
       " 'مكانكنّ',\n",
       " 'نَخْ',\n",
       " 'هاكَ',\n",
       " 'هَجْ',\n",
       " 'هلم',\n",
       " 'هيّا',\n",
       " 'هَيْهات',\n",
       " 'وا',\n",
       " 'واهاً',\n",
       " 'وراءَك',\n",
       " 'وُشْكَانَ',\n",
       " 'وَيْ',\n",
       " 'يفعلان',\n",
       " 'تفعلان',\n",
       " 'يفعلون',\n",
       " 'تفعلون',\n",
       " 'تفعلين',\n",
       " 'اتخذ',\n",
       " 'ألفى',\n",
       " 'تخذ',\n",
       " 'ترك',\n",
       " 'تعلَّم',\n",
       " 'جعل',\n",
       " 'حجا',\n",
       " 'حبيب',\n",
       " 'خال',\n",
       " 'حسب',\n",
       " 'خال',\n",
       " 'درى',\n",
       " 'رأى',\n",
       " 'زعم',\n",
       " 'صبر',\n",
       " 'ظنَّ',\n",
       " 'عدَّ',\n",
       " 'علم',\n",
       " 'غادر',\n",
       " 'ذهب',\n",
       " 'وجد',\n",
       " 'ورد',\n",
       " 'وهب',\n",
       " 'أسكن',\n",
       " 'أطعم',\n",
       " 'أعطى',\n",
       " 'رزق',\n",
       " 'زود',\n",
       " 'سقى',\n",
       " 'كسا',\n",
       " 'أخبر',\n",
       " 'أرى',\n",
       " 'أعلم',\n",
       " 'أنبأ',\n",
       " 'حدَث',\n",
       " 'خبَّر',\n",
       " 'نبَّا',\n",
       " 'أفعل به',\n",
       " 'ما أفعله',\n",
       " 'بئس',\n",
       " 'ساء',\n",
       " 'طالما',\n",
       " 'قلما',\n",
       " 'لات',\n",
       " 'لكنَّ',\n",
       " 'ءَ',\n",
       " 'أجل',\n",
       " 'إذاً',\n",
       " 'أمّا',\n",
       " 'إمّا',\n",
       " 'إنَّ',\n",
       " 'أنًّ',\n",
       " 'أى',\n",
       " 'إى',\n",
       " 'أيا',\n",
       " 'ب',\n",
       " 'ثمَّ',\n",
       " 'جلل',\n",
       " 'جير',\n",
       " 'رُبَّ',\n",
       " 'س',\n",
       " 'علًّ',\n",
       " 'ف',\n",
       " 'كأنّ',\n",
       " 'كلَّا',\n",
       " 'كى',\n",
       " 'ل',\n",
       " 'لات',\n",
       " 'لعلَّ',\n",
       " 'لكنَّ',\n",
       " 'لكنَّ',\n",
       " 'م',\n",
       " 'نَّ',\n",
       " 'هلّا',\n",
       " 'وا',\n",
       " 'أل',\n",
       " 'إلّا',\n",
       " 'ت',\n",
       " 'ك',\n",
       " 'لمّا',\n",
       " 'ن',\n",
       " 'ه',\n",
       " 'و',\n",
       " 'ا',\n",
       " 'ي',\n",
       " 'تجاه',\n",
       " 'تلقاء',\n",
       " 'جميع',\n",
       " 'حسب',\n",
       " 'سبحان',\n",
       " 'شبه',\n",
       " 'لعمر',\n",
       " 'مثل',\n",
       " 'معاذ',\n",
       " 'أبو',\n",
       " 'أخو',\n",
       " 'حمو',\n",
       " 'فو',\n",
       " 'مئة',\n",
       " 'مئتان',\n",
       " 'ثلاثمئة',\n",
       " 'أربعمئة',\n",
       " 'خمسمئة',\n",
       " 'ستمئة',\n",
       " 'سبعمئة',\n",
       " 'ثمنمئة',\n",
       " 'تسعمئة',\n",
       " 'مائة',\n",
       " 'ثلاثمائة',\n",
       " 'أربعمائة',\n",
       " 'خمسمائة',\n",
       " 'ستمائة',\n",
       " 'سبعمائة',\n",
       " 'ثمانمئة',\n",
       " 'تسعمائة',\n",
       " 'عشرون',\n",
       " 'ثلاثون',\n",
       " 'اربعون',\n",
       " 'خمسون',\n",
       " 'ستون',\n",
       " 'سبعون',\n",
       " 'ثمانون',\n",
       " 'تسعون',\n",
       " 'عشرين',\n",
       " 'ثلاثين',\n",
       " 'اربعين',\n",
       " 'خمسين',\n",
       " 'ستين',\n",
       " 'سبعين',\n",
       " 'ثمانين',\n",
       " 'تسعين',\n",
       " 'بضع',\n",
       " 'نيف',\n",
       " 'أجمع',\n",
       " 'جميع',\n",
       " 'عامة',\n",
       " 'عين',\n",
       " 'نفس',\n",
       " 'لا سيما',\n",
       " 'أصلا',\n",
       " 'أهلا',\n",
       " 'أيضا',\n",
       " 'بؤسا',\n",
       " 'بعدا',\n",
       " 'بغتة',\n",
       " 'تعسا',\n",
       " 'حقا',\n",
       " 'حمدا',\n",
       " 'خلافا',\n",
       " 'خاصة',\n",
       " 'دواليك',\n",
       " 'سحقا',\n",
       " 'سرا',\n",
       " 'سمعا',\n",
       " 'صبرا',\n",
       " 'صدقا',\n",
       " 'صراحة',\n",
       " 'طرا',\n",
       " 'عجبا',\n",
       " 'عيانا',\n",
       " 'غالبا',\n",
       " 'فرادى',\n",
       " 'فضلا',\n",
       " 'قاطبة',\n",
       " 'كثيرا',\n",
       " 'لبيك',\n",
       " 'معاذ',\n",
       " 'أبدا',\n",
       " 'إزاء',\n",
       " 'أصلا',\n",
       " 'الآن',\n",
       " 'أمد',\n",
       " 'أمس',\n",
       " 'آنفا',\n",
       " 'آناء',\n",
       " 'أنّى',\n",
       " 'أول',\n",
       " 'أيّان',\n",
       " 'تارة',\n",
       " 'ثمّ',\n",
       " 'ثمّة',\n",
       " 'حقا',\n",
       " 'صباح',\n",
       " 'مساء',\n",
       " 'ضحوة',\n",
       " 'عوض',\n",
       " 'غدا',\n",
       " 'غداة',\n",
       " 'قطّ',\n",
       " 'كلّما',\n",
       " 'لدن',\n",
       " 'لمّا',\n",
       " 'مرّة',\n",
       " 'قبل',\n",
       " 'خلف',\n",
       " 'أمام',\n",
       " 'فوق',\n",
       " 'تحت',\n",
       " 'يمين',\n",
       " 'شمال',\n",
       " 'ارتدّ',\n",
       " 'استحال',\n",
       " 'أصبح',\n",
       " 'أضحى',\n",
       " 'آض',\n",
       " 'أمسى',\n",
       " 'انقلب',\n",
       " 'بات',\n",
       " 'تبدّل',\n",
       " 'تحوّل',\n",
       " 'حار',\n",
       " 'رجع',\n",
       " 'راح',\n",
       " 'صار',\n",
       " 'ظلّ',\n",
       " 'عاد',\n",
       " 'غدا',\n",
       " 'كان',\n",
       " 'ما انفك',\n",
       " 'ما برح',\n",
       " 'مادام',\n",
       " 'مازال',\n",
       " 'مافتئ',\n",
       " 'ابتدأ',\n",
       " 'أخذ',\n",
       " 'اخلولق',\n",
       " 'أقبل',\n",
       " 'انبرى',\n",
       " 'أنشأ',\n",
       " 'أوشك',\n",
       " 'جعل',\n",
       " 'حرى',\n",
       " 'شرع',\n",
       " 'طفق',\n",
       " 'علق',\n",
       " 'قام',\n",
       " 'كرب',\n",
       " 'كاد',\n",
       " 'هبّ']"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stopwords.words('arabic')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nltk.stem import PorterStemmer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "stemmer=PorterStemmer()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "sentences=nltk.sent_tokenize(paragraph)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "list"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(sentences)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Apply Stopwords And Filter And then Apply Stemming\n",
    "\n",
    "for i in range(len(sentences)):\n",
    "    words=nltk.word_tokenize(sentences[i])\n",
    "    words=[stemmer.stem(word) for word in words if word not in set(stopwords.words('english'))]\n",
    "    sentences[i]=' '.join(words)# converting all the list of words into sentences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['I three vision india .',\n",
       " 'In 3000 year histori , peopl world come invad us , captur land , conquer mind .',\n",
       " 'from alexand onward , greek , turk , mogul , portugues , british , french , dutch , came loot us , took .',\n",
       " 'yet done nation .',\n",
       " 'We conquer anyon .',\n",
       " 'We grab land , cultur , histori tri enforc way life .',\n",
       " 'whi ?',\n",
       " 'becaus respect freedom others.that first vision freedom .',\n",
       " 'I believ india got first vision 1857 , start war independ .',\n",
       " 'It freedom must protect nurtur build .',\n",
       " 'If free , one respect us .',\n",
       " 'My second vision india ’ develop .',\n",
       " 'for fifti year develop nation .',\n",
       " 'It time see develop nation .',\n",
       " 'We among top 5 nation world term gdp .',\n",
       " 'We 10 percent growth rate area .',\n",
       " 'our poverti level fall .',\n",
       " 'our achiev global recognis today .',\n",
       " 'yet lack self-confid see develop nation , self-reli self-assur .',\n",
       " 'isn ’ incorrect ?',\n",
       " 'I third vision .',\n",
       " 'india must stand world .',\n",
       " 'becaus I believ unless india stand world , one respect us .',\n",
       " 'onli strength respect strength .',\n",
       " 'We must strong militari power also econom power .',\n",
       " 'both must go hand-in-hand .',\n",
       " 'My good fortun work three great mind .',\n",
       " 'dr. vikram sarabhai dept .',\n",
       " 'space , professor satish dhawan , succeed dr. brahm prakash , father nuclear materi .',\n",
       " 'I lucki work three close consid great opportun life .',\n",
       " 'I see four mileston career']"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sentences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nltk.stem import SnowballStemmer\n",
    "snowballstemmer=SnowballStemmer('english')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Apply Stopwords And Filter And then Apply Snowball Stemming\n",
    "\n",
    "for i in range(len(sentences)):\n",
    "    words=nltk.word_tokenize(sentences[i])\n",
    "    words=[snowballstemmer.stem(word) for word in words if word not in set(stopwords.words('english'))]\n",
    "    sentences[i]=' '.join(words)# converting all the list of words into sentences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['i three vision india .',\n",
       " 'in 3000 year histori , peopl world come invad us , captur land , conquer mind .',\n",
       " 'from alexand onward , greek , turk , mogul , portugues , british , french , dutch , came loot us , took .',\n",
       " 'yet done nation .',\n",
       " 'we conquer anyon .',\n",
       " 'we grab land , cultur , histori tri enforc way life .',\n",
       " 'whi ?',\n",
       " 'becaus respect freedom others.that first vision freedom .',\n",
       " 'i believ india got first vision 1857 , start war independ .',\n",
       " 'it freedom must protect nurtur build .',\n",
       " 'if free , one respect us .',\n",
       " 'my second vision india ’ develop .',\n",
       " 'for fifti year develop nation .',\n",
       " 'it time see develop nation .',\n",
       " 'we among top 5 nation world term gdp .',\n",
       " 'we 10 percent growth rate area .',\n",
       " 'our poverti level fall .',\n",
       " 'our achiev global recognis today .',\n",
       " 'yet lack self-confid see develop nation , self-reli self-assur .',\n",
       " 'isn ’ incorrect ?',\n",
       " 'i third vision .',\n",
       " 'india must stand world .',\n",
       " 'becaus i believ unless india stand world , one respect us .',\n",
       " 'onli strength respect strength .',\n",
       " 'we must strong militari power also econom power .',\n",
       " 'both must go hand-in-hand .',\n",
       " 'my good fortun work three great mind .',\n",
       " 'dr. vikram sarabhai dept .',\n",
       " 'space , professor satish dhawan , succeed dr. brahm prakash , father nuclear materi .',\n",
       " 'i lucki work three close consid great opportun life .',\n",
       " 'i see four mileston career']"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sentences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nltk.stem import WordNetLemmatizer\n",
    "lemmatizer=WordNetLemmatizer()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Apply Stopwords And Filter And then Apply Snowball Stemming\n",
    "\n",
    "for i in range(len(sentences)):\n",
    "    #sentences[i]=sentences[i].lower()\n",
    "    words=nltk.word_tokenize(sentences[i])\n",
    "    words=[lemmatizer.lemmatize(word.lower(),pos='v') for word in words if word not in set(stopwords.words('english'))]\n",
    "    sentences[i]=' '.join(words)# converting all the list of words into sentences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['i three visions india .',\n",
       " 'in 3000 years history , people world come invade us , capture land , conquer mind .',\n",
       " 'from alexander onwards , greeks , turks , moguls , portuguese , british , french , dutch , come loot us , take .',\n",
       " 'yet do nation .',\n",
       " 'we conquer anyone .',\n",
       " 'we grab land , culture , history try enforce way life .',\n",
       " 'why ?',\n",
       " 'because respect freedom others.that first vision freedom .',\n",
       " 'i believe india get first vision 1857 , start war independence .',\n",
       " 'it freedom must protect nurture build .',\n",
       " 'if free , one respect us .',\n",
       " 'my second vision india ’ development .',\n",
       " 'for fifty years develop nation .',\n",
       " 'it time see develop nation .',\n",
       " 'we among top 5 nations world term gdp .',\n",
       " 'we 10 percent growth rate areas .',\n",
       " 'our poverty level fall .',\n",
       " 'our achievements globally recognise today .',\n",
       " 'yet lack self-confidence see develop nation , self-reliant self-assured .',\n",
       " 'isn ’ incorrect ?',\n",
       " 'i third vision .',\n",
       " 'india must stand world .',\n",
       " 'because i believe unless india stand world , one respect us .',\n",
       " 'only strength respect strength .',\n",
       " 'we must strong military power also economic power .',\n",
       " 'both must go hand-in-hand .',\n",
       " 'my good fortune work three great mind .',\n",
       " 'dr. vikram sarabhai dept .',\n",
       " 'space , professor satish dhawan , succeed dr. brahm prakash , father nuclear material .',\n",
       " 'i lucky work three closely consider great opportunity life .',\n",
       " 'i see four milestones career']"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sentences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
