Let me talk a bit about my latest little project I wrote in Python 3.

The problem

As I'm part of the team behind Ultimate Host Blacklist, along with other similar projects, we often encounter domains that PyFunceble flags as INVALID.

So how can we convert each domain listed in the INVALID.txt file generated by PyFunceble to the IDNA format, so that we can reintroduce them for testing? My answer: domain2idna.

Understanding Punycode and IDNA

Before you continue, I invite you to read the following explanation of Punycode/IDNA from charset.com:

Punycode is an encoding syntax by which a Unicode (UTF-8) string of characters can be translated into the basic ASCII-characters permitted in network host names. Punycode is used for internationalized domain names, in short IDN or IDNA (Internationalizing Domain Names in Applications).

For example, when you would type café.com in your browser, your browser (which is the IDNA-enabled application) first converts the string to Punycode "xn--caf-dma.com", because the character 'é' is not allowed in regular domain names. Punycode domains won't work in very old browsers (Internet Explorer 6 and earlier).

Find more detailed info in the specification RFC 3492.

As another example, a domain like lifehacĸer.com (note the ĸ) is actually translated to xn--lifehacer-1rb.com. You may not encounter this kind of domain in your daily browsing, but when it comes to hosts files, we encounter them almost everywhere.
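To try the conversion yourself, here is a minimal sketch that relies only on the `idna` and `punycode` codecs shipped with Python 3's standard library. It does not use domain2idna, and the helper name `to_ascii` is mine. Keep in mind that the built-in `idna` codec follows the older IDNA 2003 rules, so other tools may disagree with it on some edge cases.

```python
# Minimal sketch: Punycode/IDNA conversion with Python's built-in codecs.
# This does not use domain2idna; it only illustrates the encoding itself.


def to_ascii(domain):
    """Return the ASCII (IDNA) form of a Unicode domain name."""
    return domain.encode("idna").decode("ascii")


# The two domains quoted in the text above.
cafe = to_ascii("café.com")
kra = to_ascii("lifehacĸer.com")

# The "punycode" codec encodes a single label, without the "xn--" prefix.
label = "café".encode("punycode").decode("ascii")

print(cafe)
print(kra)
print(label)
```

The first two lines printed should match the Punycode forms quoted above, and the third shows the bare label that the `xn--` prefix is put in front of.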

Indeed, IDNA-formatted domains are nowadays mostly used for phishing, as described in this Hacker News article, which explains the dangers of IDNA in more depth.

About domain2idna

Domain2idna can be found on GitHub and is ready to use!

It can be used in two different ways: as an imported module or as a command-line tool.

As an imported module

As Python allows any installed module to be imported, here is an example of how to use domain2idna in existing code or infrastructure.

#!/usr/bin/env python3

"""
This module uses domain2idna to convert the given domains.

Author:
    Nissar Chababy, @funilrys, contactTATAfunilrysTODTODcom

Contributors:
    Let's contribute to this example!!

Repository:
    https://github.com/funilrys/domain2idna
"""

from colorama import Style
from colorama import init as initiate

from domain2idna.core import Core

DOMAINS = [
    "bittréẋ.com", "bịllogram.com", "coinbȧse.com", "cryptopiạ.com", "cṙyptopia.com"
]

# We activate the automatic reset of string formatting.
initiate(True)

# The following prints the result of converting the whole list.
print(
    "%sList of converted domains:%s %s"
    % (Style.BRIGHT, Style.RESET_ALL, Core(DOMAINS).to_idna())
)

# The following prints the result of converting only one element.
print(
    "%sString representing a converted domain:%s %s"
    % (Style.BRIGHT, Style.RESET_ALL, Core(DOMAINS[-1]).to_idna())
)

That is a simple example to help understand how domain2idna works.

As you can see, domain2idna can return two types: a list or a str. Indeed, because I'll mostly use domain2idna to convert big lists, I wrote it so that it can take a list and return a list of the converted domains. On the other hand, as most people will only want the IDNA format of a single domain, domain2idna also returns a str if a string is given as input.
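To make that "list in, list out / str in, str out" behaviour concrete, here is a small sketch of how such a dispatch could look. It is not the code of domain2idna: the function name `convert`, the use of Python's built-in `idna` codec, and the choice to return unconvertible entries untouched are all assumptions made for illustration.

```python
# Hypothetical sketch of the "list in, list out / str in, str out" idea.
# It relies on Python's built-in "idna" codec, which is an assumption made
# for illustration and not necessarily what domain2idna does internally.


def convert(subject):
    """Convert a domain (str) or a list of domains to the IDNA format."""
    if isinstance(subject, list):
        # A list is converted element by element and a list is returned.
        return [convert(domain) for domain in subject]

    try:
        return subject.encode("idna").decode("ascii")
    except UnicodeError:
        # Assumption: an entry which cannot be converted is returned as-is.
        return subject


single = convert("cṙyptopia.com")
many = convert(["coinbȧse.com", "example.org"])

print(type(single).__name__, single)
print(type(many).__name__, many)
```

Returning the same kind of object as the input keeps the calling code simple: a script that feeds in a whole hosts list gets a list back, while a quick one-off check gets a plain string.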

As a command-line tool

This part is less "interesting", but the following usage message explains quite well how it works.

usage: domain2idna [-h] [-d DOMAIN] [-f FILE] [-o OUTPUT]

domain2idna - A tool to convert a domain or a file with a list of domain to
the famous IDNA format.

optional arguments:
-h, --help            show this help message and exit
-d DOMAIN, --domain DOMAIN
                    Set the domain to convert.
-f FILE, --file FILE  Set the domain to convert.
-o OUTPUT, --output OUTPUT
                    Set the file where we write the converted domain(s).

Crafted with ♥ by Nissar Chababy (Funilrys)
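For readers curious about how such an interface can be wired up, here is a rough sketch built with `argparse`. It only mirrors the three options shown in the usage message above; it is not the actual source of domain2idna. The function names, the help text of `--file`, and the conversion through Python's built-in `idna` codec are assumptions.

```python
# Hypothetical argparse sketch mirroring the options listed above.
# This is not the actual domain2idna source.
import argparse


def build_parser():
    """Build a parser exposing the -d, -f and -o options."""
    parser = argparse.ArgumentParser(
        prog="domain2idna",
        description="Convert a domain or a file with a list of domains to IDNA.",
    )
    parser.add_argument("-d", "--domain", help="Set the domain to convert.")
    parser.add_argument(
        "-f", "--file", help="Set the file with the domains to convert."
    )
    parser.add_argument(
        "-o", "--output",
        help="Set the file where we write the converted domain(s).",
    )
    return parser


def run(argv):
    """Return the list of converted domains for the given arguments."""
    args = build_parser().parse_args(argv)
    domains = []

    if args.domain:
        domains.append(args.domain)

    if args.file:
        # One domain per line; empty lines are skipped.
        with open(args.file, encoding="utf-8") as stream:
            domains.extend(line.strip() for line in stream if line.strip())

    # Assumption: the built-in "idna" codec stands in for the real conversion.
    converted = [domain.encode("idna").decode("ascii") for domain in domains]

    if args.output:
        with open(args.output, "w", encoding="utf-8") as stream:
            stream.write("\n".join(converted) + "\n")

    return converted


result = run(["-d", "café.com"])
print(result)
```

With a layout like this, converting the INVALID.txt file mentioned at the beginning would amount to something like `run(["-f", "INVALID.txt", "-o", "converted.txt"])`, where the output file name is of course only an example.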

In conclusion, it was fun to write this little project, and I hope that it'll help the Open-Source community!

That's it for the presentation of the project! A detailed code commentary/explanation may come soon in the programming section.

Thanks for reading.

Let's talk about PyFunceble, a tool to check the availability of a domain, an IPv4 address, or a list of domains or IPv4 addresses.


Do you know Funceble?

Well, that's awesome because you then already know PyFunceble. Test PyFunceble and let me know what you think about it on Twitter (with #PyFunceble) or GitHub!


The main idea behind PyFunceble is to take Funceble to the next level. Indeed, Funceble was and still is great, but as many people mentioned when I released it, "it's written in Shell", which is not available on every machine. The reason Funceble was written in Shell is that when I started writing it months ago, I wanted to write something helpful while also improving my Shell skills.

Today, things are different: Funceble is certainly used, but only on Unix-based systems, which made me think, "What if other systems could use Funceble?" At the time I wrote Funceble, I already knew some Python, but I never had the time or the desire to improve the skills I had gained. But as Python 3 is portable, and available for download or already installed on almost all modern machines, I decided that it was time to improve my skills and to rewrite Funceble.

That was the beginning of PyFunceble. I had some great and some bad times developing PyFunceble to its current state, but it was worth it when I see the result. I made many improvements to the way the algorithm of Funceble is structured. I also had some fun and learned a lot about Python. That sums up my time writing PyFunceble quite well. I also added one feature that I don't know how to implement in Funceble yet. That feature is behind inactive-db.json, which can be seen in the repository. Indeed, I had a discussion about this one day and ended up with the idea of creating a database of all inactive domains so that they can be retested over time: the content of the database is automatically added to the list of domains to test when we test the same file again on a following day.
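The idea behind that database can be sketched in a few lines. The sketch below is purely illustrative: the JSON layout (one list of inactive domains per tested file name) and every function name are assumptions of mine, not the actual format of inactive-db.json or the API of PyFunceble.

```python
# Hypothetical sketch of the inactive-database idea. The JSON layout
# ({"<tested file>": ["<inactive domain>", ...]}) and every function name
# are assumptions for illustration, not PyFunceble's actual format or API.
import json
import os
import tempfile


def load_db(path):
    """Return the content of the database, or an empty one if it is missing."""
    if not os.path.isfile(path):
        return {}
    with open(path, encoding="utf-8") as stream:
        return json.load(stream)


def save_inactive(path, tested_file, inactive_domains):
    """Remember the domains found inactive while testing `tested_file`."""
    database = load_db(path)
    known = set(database.get(tested_file, []))
    database[tested_file] = sorted(known | set(inactive_domains))
    with open(path, "w", encoding="utf-8") as stream:
        json.dump(database, stream, indent=4)


def domains_to_test(path, tested_file, current_domains):
    """Merge the file's current domains with those previously found inactive."""
    previous = load_db(path).get(tested_file, [])
    return sorted(set(current_domains) | set(previous))


# First run: "dead.example" is found inactive while testing hosts.txt.
db_path = os.path.join(tempfile.mkdtemp(), "inactive-db.json")
save_inactive(db_path, "hosts.txt", ["dead.example"])

# Later run on the same file: the inactive domain is queued for a retest.
todo = domains_to_test(db_path, "hosts.txt", ["alive.example"])
print(todo)
```

Keying the database by file name is what lets a domain that disappeared from a list still be checked again the next time that same list is tested.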

Today PyFunceble is ready for its first release, but that release is not going to happen yet. Indeed, I don't want to release the first version of PyFunceble now because that would mean that Funceble is becoming obsolete. But that's not the case! That's why the first version of PyFunceble will be released at the same time as Funceble 2.0.


Representation of the logic

More information can also be found on the PyFunceble Wiki.


Related questions

Will I rewrite Funceble in a language other than Python and Bash?

It's not planned, but if someone wants to start rewriting it in their favorite language, I'll be glad to help!

Will I stop developing Funceble in favor of PyFunceble?

Honestly, I haven't really thought about that yet, but let's see where the future takes us.

Today I start a new adventure with funceble, a new script to check the availability of domains or IPs.

Funceble is now described as:

[an] excellent script for checking ACTIVE, INACTIVE and EXPIRED domain names.

Where does the idea come from?

The idea of the script came out of a discussion around Steven Black's famous hosts file(s). Indeed, the hosts file is really great, but the main problem is that there are probably many domains in it that do not exist anymore.

So I planned to write funceble.

An explanation of the script?

A full release changelog can be found here.


Special Thanks

Thank you for your awesome hosts lists, support or contributions which helped (and/or still help) me build this script. :smile: :+1:

Contributors

Thank you for your awesome ideas or contributions which make or made funceble better!! :+1: :100: :1st_place_medal:


Supporting the project

Funceble and dead-hosts are powered by :coffee:!

Does this project help you, and/or do you like it? Why don't you buy me a cup of coffee? :smile_cat: