There are a lot of mysteries behind this picture. Can you find them all?

Let me talk a bit about my latest little project I wrote in Python 3.

The problem

As I'm part of the team behind Ultimate Host Blacklist along with other similar projects, we often encounter domains which are flagged as INVALID by PyFunceble.

So how can we convert each domain of the INVALID.txt generated by PyFunceble to IDNA format so we can reintroduce them for testing? My answer: domain2idna.

Understanding Punycode and IDNA

Before reading on, I invite you to read the following excerpt from charset.com, which explains Punycode/IDNA:

Punycode is an encoding syntax by which a Unicode (UTF-8) string of characters can be translated into the basic ASCII-characters permitted in network host names. Punycode is used for internationalized domain names, in short IDN or IDNA (Internationalizing Domain Names in Applications).

For example, when you would type café.com in your browser, your browser (which is the IDNA-enabled application) first converts the string to Punycode "xn--caf-dma.com", because the character 'é' is not allowed in regular domain names. Punycode domains won't work in very old browsers (Internet Explorer 6 and earlier).

Find more detailed info in the specification RFC 3492.

As another example, a domain like lifehacĸer.com (note the K) is actually translated to xn--lifehacer-1rb.com. You may not encounter this kind of domain in your daily browsing, but when it comes to hosts files, we encounter them almost everywhere.

Indeed, today IDNA-formatted domains are mostly used for phishing, as described in this Hacker News article, which explains the dangers of IDNA in more depth.
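To make the conversion concrete, here is a minimal sketch that uses only the `idna` and `punycode` codecs shipped with Python's standard library. It is not how domain2idna itself is implemented: the helper names `to_idna` and `from_idna` are hypothetical, and the built-in codec follows the older IDNA 2003 rules, so results may differ for some characters.

```python
def to_idna(domain):
    """Convert a Unicode domain name to its ASCII (Punycode) form.

    Hypothetical helper: it relies on Python's built-in "idna" codec
    (IDNA 2003), not on domain2idna.
    """
    return domain.encode("idna").decode("ascii")


def from_idna(domain):
    """Convert an ASCII (Punycode) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


print(to_idna("café.com"))           # xn--caf-dma.com
print(from_idna("xn--caf-dma.com"))  # café.com

# The "punycode" codec encodes a single label, without the xn-- prefix.
print("café".encode("punycode"))     # b'caf-dma'
```

The `xn--` prefix is what marks a label as Punycode-encoded; the codec adds it for every label that contains non-ASCII characters.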

About domain2idna

Domain2idna can be found on GitHub and is ready to use!

It can be used in two different ways: as an imported module or as a command-line tool.

As an imported module

As Python allows an installed module to be imported, here is an example of how to use domain2idna in existing code or infrastructure.

#!/usr/bin/env python3

"""
This module uses domain2idna to convert a given domain.

Author:
    Nissar Chababy, @funilrys, contactTATAfunilrysTODTODcom

Contributors:
    Let's contribute to this example!!

Repository:
    https://github.com/funilrys/domain2idna
"""

from colorama import Style
from colorama import init as initiate

from domain2idna.core import Core

DOMAINS = [
    "bittréẋ.com", "bịllogram.com", "coinbȧse.com", "cryptopiạ.com", "cṙyptopia.com"
]

# We activate the automatic reset of string formatting.
initiate(True)

# The following prints the result of converting the whole list.
print(
    "%sList of converted domains:%s %s"
    % (Style.BRIGHT, Style.RESET_ALL, Core(DOMAINS).to_idna())
)

# The following prints the result of converting only one element.
print(
    "%sString representing a converted domain:%s %s"
    % (Style.BRIGHT, Style.RESET_ALL, Core(DOMAINS[-1]).to_idna())
)

This simple example shows how domain2idna works.

As you can see, domain2idna can return two types: a list or a str. Because I'll mostly use domain2idna to convert big lists, I wrote it so that it can take a list and return a list of the converted domains. On the other hand, as most people will only want the IDNA format of a single domain, domain2idna also returns a str if a string is given as input.
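The "list in, list out; string in, string out" behaviour can be sketched as follows. This is only an illustration built on Python's built-in `idna` codec; the function name `convert` is hypothetical and is not part of domain2idna's API.

```python
def convert(domains):
    """Return the IDNA form of a domain (str) or of each domain (list).

    Hypothetical helper for illustration only.
    """

    def _convert_one(domain):
        # The built-in "idna" codec implements IDNA 2003.
        return domain.encode("idna").decode("ascii")

    # A single string is converted and returned as a string.
    if isinstance(domains, str):
        return _convert_one(domains)

    # Anything else is treated as an iterable of domains.
    return [_convert_one(domain) for domain in domains]


print(convert("café.com"))
print(convert(["café.com", "example.com"]))
```

Checking the input type once at the entry point keeps the conversion logic itself unaware of whether it is handling one domain or many.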

As a command-line tool

This part is less "interesting", but the following usage message explains well how it works.

usage: domain2idna [-h] [-d DOMAIN] [-f FILE] [-o OUTPUT]

domain2idna - A tool to convert a domain or a file with a list of domain to
the famous IDNA format.

optional arguments:
-h, --help            show this help message and exit
-d DOMAIN, --domain DOMAIN
                    Set the domain to convert.
-f FILE, --file FILE  Set the domain to convert.
-o OUTPUT, --output OUTPUT
                    Set the file where we write the converted domain(s).

Crafted with ♥ by Nissar Chababy (Funilrys)

In conclusion, it was fun to write this little project, and I hope that it'll help the open-source community!

That's it for the presentation of the project! A detailed code comment/explanation may come soon in the programming section.

Thanks for reading.

Reflection, you stay only for a short time, but you always come back.

Today I challenged myself to write a way to find out whether a number is happy or not.

So what is a happy number?

To quote the Wikipedia page:

A happy number is defined by the following process: Starting with any positive integer, replace the number by the sum of the squares of its digits in base-ten, and repeat the process until the number either equals 1 (where it will stay), or it loops endlessly in a cycle that does not include 1. Those numbers for which this process ends in 1 are happy numbers, while those that do not end in 1 are unhappy numbers (or sad numbers).
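As a quick illustration of that definition, the sketch below follows the process step by step and records every intermediate value. The function name `happy_chain` is made up for this example; it stops when it reaches 1 or meets a value it has already seen.

```python
def happy_chain(number):
    """Return the successive sums of squared digits, starting at number."""
    chain = [number]

    while True:
        number = sum(int(digit) ** 2 for digit in str(number))

        if number in chain:
            # We are looping: record the repeated value and stop.
            chain.append(number)
            return chain

        chain.append(number)

        if number == 1:
            # We reached 1: the starting number is happy.
            return chain


print(happy_chain(19))  # [19, 82, 68, 100, 1]
print(happy_chain(4))   # [4, 16, 37, 58, 89, 145, 42, 20, 4]
```

So 19 is happy, while 4 falls into a cycle that never reaches 1 and is therefore unhappy.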

Now that we know what a happy number is, let's code!

Let's get a positive number from the user

For this part I will use:

  • input() to get the input from the user.
  • int() to convert the output of input() to integer.
  • abs() to return the absolute value of the given number.

def given_number():
    """
    This function asks the user for a number and returns it so it
    can be used by other functions.

    Returns: int
        The given number.
    """

    # We use an endless while loop because we want to keep asking
    # for a number as long as the user does not give us one.
    while True:
        # Because the user can input anything, I chose to handle
        # the exception raised when the input is not a number.
        # This way we keep asking until the user gives us a
        # number.
        try:
            initial_number = abs(int(input("Give us a number ")))
            break
        except ValueError:
            continue

    return initial_number

Splitting digits

To check whether a number is happy, we have to calculate the sum of the squares of its digits in base ten. So, I chose to split the number into its digits before doing the calculations.

For this part I will use:

  • int() to convert each digit to an integer.
  • str() to convert the given number to a string.
  • map() to iterate through the converted string.

def split_digits(digits):
    """
    This function splits a given number into its digits.

    Argument:
        - digits: int
            The number to split.

    Returns: list
        A list with each digit.
    """

    # We convert the given number to a string first.
    # Then we iterate over each character, which is
    # converted to an integer.
    # As map() returns an iterator in Python 3, we wrap it
    # in list() to get the list of the digits of the number.
    return list(map(int, str(digits)))
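One thing worth keeping in mind: in Python 3, `map()` returns a lazy iterator rather than a list, so its result can only be walked through once. The short sketch below shows the difference and why wrapping the call in `list()` is useful when the digits need to be reused.

```python
lazy_digits = map(int, str(1234))

# The first pass consumes the iterator...
first_pass = list(lazy_digits)
# ...so a second pass finds nothing left.
second_pass = list(lazy_digits)

print(first_pass)   # [1, 2, 3, 4]
print(second_pass)  # []

# A real list can be iterated over as many times as needed.
digits = list(map(int, str(1234)))
print(sum(digits), len(digits))  # 10 4
```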

Calculation of the sum of the squares of the digits

For this part I will use:

  • pow() to calculate the square of each digit.

def calculation(digits):
    """
    This function returns the sum of the squares of the given
    digits.

    Argument:
        - digits: list
            The digits to work with.

    Returns: int
        The sum of the squares of the digits.
    """

    # This variable is used to store the calculation result.
    result = 0

    # We use for to iterate through the list of digits given by
    # split_digits().
    for digit in digits:

        # We add the square of the current digit to result.
        # This way we can return the result once we have finished
        # iterating through the list of digits.
        result += pow(digit, 2)

    return result
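For comparison, the same sum can be written more compactly with the built-in `sum()` and a generator expression. This is just an alternative sketch (the name `sum_of_squares` is hypothetical), not a replacement for the step-by-step version.

```python
def sum_of_squares(digits):
    """Return the sum of the squares of the given digits."""
    return sum(digit ** 2 for digit in digits)


print(sum_of_squares([1, 9]))     # 82
print(sum_of_squares([1, 0, 0]))  # 1
```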

Is a number happy?

Because I wanted to be able to check whether a number is happy from another script, I chose to write a function which tells us if a number is happy (True) or unhappy (False).

As I also want to see the sequence of results of calculation() when working with this function, I added a switch which, if True, makes the function return a tuple: (True, [results of calculation()]) if the number is happy and (False, [results of calculation()]) if the number is unhappy.

Finally, please note that as I did not want to wait on an endless loop, I chose to check whether the result of calculation() is already in our list of results. If that is the case, then we have an unhappy number.

For this part I will use:

def is_happy(number, return_sequence=False):
    """
    This function checks if a number is happy or not.

    Arguments:
        - number: int
            The number to check.
        - return_sequence: bool
            If True we return the sequence of results.

    Returns: bool or tuple
        - True: number is happy.
        - False: number is unhappy.
        - tuple: if return_sequence == True
            - We return (True|False, past_results)
    """

    # This will save the list of previous or past results.
    past_results = []

    list_of_digits = split_digits(number)

    # I chose to use an endless loop because we do not know
    # in advance how many steps are needed.
    while True:
        current_result = calculation(list_of_digits)

        if current_result in past_results:
            # We already saw this result: we are in a cycle which
            # does not include 1, so the number is unhappy.
            if return_sequence:
                return (False, past_results)
            return False

        if current_result == 1:
            # We reached 1: the number is happy.
            if return_sequence:
                return (True, past_results)
            return True

        # We save the result and continue with its digits.
        past_results.append(current_result)
        list_of_digits = split_digits(current_result)
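To see the idea at work, here is a self-contained sketch that lists the happy numbers up to 50. It uses a set instead of a list to remember past results, which makes the "already seen" check faster on long sequences; the function name `is_happy_number` is chosen for this example so that it does not clash with `is_happy()`.

```python
def is_happy_number(number):
    """Return True if number is happy, False otherwise."""
    seen = set()

    # We stop as soon as we reach 1 or meet a value we already saw.
    while number != 1 and number not in seen:
        seen.add(number)
        number = sum(int(digit) ** 2 for digit in str(number))

    return number == 1


print([n for n in range(1, 51) if is_happy_number(n)])
# [1, 7, 10, 13, 19, 23, 28, 31, 32, 44, 49]
```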

What if we want to run the script?

Well, to run it as a script with, for example, python happy_number.py, I added the following to the script.

Please note the usage of if __name__ == '__main__':, which prevents this part from running when we import, for example, is_happy() from another script or module.

In this part I will use:

if __name__ == '__main__':
    NUMBER = given_number()

    if is_happy(NUMBER):
        print('%d is a happy number' % NUMBER)
    else:
        print('%d is an unhappy number' % NUMBER)

Final script
